We show that Delaunay triangulations and compressed quadtrees are equivalent structures. More precisely,
we give two algorithms: the first computes a compressed quadtree for a planar point set, given the Delaunay
triangulation; the second finds the Delaunay triangulation, given a compressed quadtree. Both algorithms
run in deterministic linear time on a pointer machine. Our work builds on and extends previous results by
Krznaric and Levcopoulos [38] and Buchin and Mulzer [9]. Our main tool for the second algorithm is the
well-separated pair decomposition (WSPD) [12], a structure that has been used previously to find Euclidean
minimum spanning trees in higher dimensions [26]. We show that knowing the WSPD (and a quadtree)
suffices to compute a planar Euclidean minimum spanning tree (EMST) in linear time. With the EMST at
hand, we can find the Delaunay triangulation in linear time [20].

As a corollary, we obtain deterministic versions of many previous algorithms related to Delaunay
triangulations, such as splitting planar Delaunay triangulations [18, 19], preprocessing imprecise points
for faster Delaunay computation [8, 40], and transdichotomous Delaunay triangulations [9, 14, 15].
1 Introduction

Delaunay triangulations and quadtrees are among the oldest and best-studied notions in computational
geometry [3, 6, 24, 28, 42, 43, 45, 47], captivating the attention of researchers for almost four
decades. Both are proximity structures on planar point sets; Figure 1 shows a simple example of
these structures. Here, we will demonstrate that they are, in fact, equivalent in a very strong sense.
Specifically, we describe two algorithms. The first computes a suitable quadtree for P, given the
Delaunay triangulation DT(P). This algorithm closely follows a previous result by Krznaric and
Levcopoulos [38], who solve this problem in a stronger model of computation. Our contribution
lies in adapting their algorithm to the real RAM/pointer machine model.¹ The second algorithm,
which is the main focus of this paper, goes in the other direction and computes DT(P), assuming
that a suitable quadtree for P is at hand.

*A preliminary version appeared in Proc. 22nd SODA, pp. 1759-1777, 2011.

Fig. 1: A planar point set P, and a quadtree (a) and a Delaunay triangulation (b) on it.

The connection between quadtrees and Delaunay triangulations was first discovered and fruitfully
applied by Buchin and Mulzer [9] (see also [...]). While their approach is to use a hierarchy
of quadtrees for faster conflict location in a randomized incremental construction of DT(P), we
pursue a strategy similar to the one by Löffler and Snoeyink [40]: we use the additional information
to find a connected subgraph of DT(P), from which DT(P) can be computed in linear
deterministic time [20]. As in Löffler and Snoeyink [40], our subgraph of choice is the Euclidean
minimum spanning tree (EMST) for P, emst(P) [26]. The connection between quadtrees and EMSTs
is well known: initially, quadtrees were used to obtain fast approximations to emst(P) in high
dimensions [11, 49]. Developing these ideas further, several algorithms were found that use the
well-separated pair decomposition (WSPD) [12], or a variant thereof, to reduce EMST computation
to solving the bichromatic closest pair problem. In that problem, we are given two point sets R
and B, and we look for a pair $(r, b) \in R \times B$ that minimizes the distance $|rb|$ [1, 11, 39, 51]. Given a
quadtree for P, a WSPD for P can be found in linear time [8, 12, 13, 33]. EMST algorithms based
on bichromatic closest pairs constitute the fastest known solutions in higher dimensions. Our approach
is quite similar, but we focus exclusively on the plane. We use the quadtree and WSPDs
to obtain a sequence of bichromatic closest pair problems, which then yield a sparse supergraph of
the EMST. There are several issues: we need to ensure that the bichromatic closest pair problems
have total linear size and can be solved in linear time, and we also need to extract the EMST from
the supergraph in linear time. In this paper we show how to do this using the structure of the
quadtree, combined with a partition of the point set according to angular segments similar to Yao's
technique [51].



1.1 Applications

Our two algorithms have several implications for derandomizing recent algorithms related to DTs.
First, we mention hereditary computation of DTs. Chazelle et al. [18] show how to split a Delaunay
triangulation in linear expected time (see also [19]). That is, given $DT(P \cup Q)$, they describe a
randomized algorithm to find DT(P) and DT(Q) in expected time $O(|P| + |Q|)$. Knowing that DTs
and quadtrees are equivalent, this result becomes almost obvious, as quadtrees are easily split in
linear time. More importantly, our new algorithm achieves linear worst-case running time. Ailon
et al. [2] use hereditary DTs for self-improving algorithms. Together with the ε-net construction
by Pyrga and Ray [44] (see [2, Appendix A]), our result yields a deterministic version of their
algorithm for point sets generated by a random source (the inputs are probabilistic, but not the
algorithm).



Eppstein et al. [27] introduce the skip-quadtree and show how to turn a (compressed) quadtree
into a skip-quadtree in linear time. Buchin and Mulzer [9] use a (randomized) skip-quadtree to
find the DT in linear expected time. This yields several improved results about computing DTs.
Most notably, they show that in the transdichotomous setting [14, 15, 29], computing DTs is no
harder than sorting the points (according to some special order). Here, we show how to go directly
from a quadtree to a DT, without skip-quadtrees or randomness. This gives the first deterministic
transdichotomous reduction from DTs to sorting.

¹ Refer to Appendix A for a description of different computational models.

Fig. 2: We show which structures can be computed from which in linear time. The black arrows depict known
linear time deterministic algorithms that work in the pointer machine/real RAM model. The red arrows depict our
new results. Furthermore, for reference, we also show known randomized linear time algorithms (in green)
and known deterministic linear time algorithms that work in a weaker model of computation (in blue).

Buchin et al. [8] use both hereditary DTs and the connection between skip-quadtrees and DTs
to simplify and generalize an algorithm by Löffler and Snoeyink [40] to preprocess imprecise points
for Delaunay triangulation in linear expected time (see also Devillers [25] for another simplified, but
not worst-case optimal, solution). Löffler and Snoeyink's original algorithm is deterministic, and
the derandomized version of the Buchin et al. algorithm proceeds in a very similar spirit. However,
we now have an optimal deterministic solution for the generalized problem as well.

In Figure 2, we show a graphical representation of different proximity structures on planar point
sets. The arrows show which structures can be computed from which in linear deterministic time
on a pointer machine, before and after this paper. Note that the individual algorithms and their
interactions have several subtleties that are hard to show in a diagram; the figure is included purely
as an illustration of the impact of our results.

1.2 Organization of this paper 

The main result of our paper is an algorithm to compute a minimum spanning tree of a set of
points from a given compressed quadtree. However, before we can describe this result in Section 4,
we need to establish the necessary tools; to this end we review several known concepts in Section 2
and prove some related technical lemmas in Section 3. In Section 5, we describe the algorithm to
compute a quadtree when given the Delaunay triangulation; this is an adaptation of the algorithm
by Krznaric and Levcopoulos [38] to the real RAM model. Finally, we detail some important
implications of our two new algorithms in Section 6.
2 Preliminaries

We review some known definitions, structures, algorithms, and their relationships.

2.1 Delaunay Triangulations and Euclidean Minimum Spanning Trees

Given a set P of n points in the plane, an important and extensively studied structure is the
Delaunay triangulation of P [3, 6, 24, 43, 47], denoted DT(P). It can be defined as the dual graph
of the Voronoi diagram, the triangulation that optimizes the smallest angle in any triangle, or in
many other equivalent ways, and it has been proven to optimize many other different criteria [42].

The Euclidean minimum spanning tree of P, denoted emst(P), is the tree of smallest total edge
length that has the points of P as its vertices, and it is well known that the EMST is a subgraph
of the DT [47, Theorem 7]. In the following, we will assume that all the pairwise distances in P
are distinct (a general position assumption), which implies that emst(P) is uniquely determined.
Finally, we remind the reader that emst(P), like every minimum spanning tree, has the following
cut property: let $P = R \cup B$ be a partition of P, and let r and b be the two points with $r \in R$
and $b \in B$ that minimize the distance $|rb|$. Then rb is an edge of emst(P). Note that this is
very similar to the bichromatic closest pair reduction mentioned in the introduction, but the cut
property holds for any partition of P, whereas the bichromatic closest pair reduction requires a
very specific decomposition of P into pairs of subsets (which is usually not a partition).
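
The cut property can be checked by brute force on a small instance. The following is a minimal sketch, not part of the algorithms of this paper: it computes emst(P) with a quadratic-time version of Prim's algorithm and compares it against the closest red-blue pair of every partition. The function names and the random test points are assumptions made for illustration.

```python
import math
import random

def emst_edges(points):
    """Prim's algorithm on the complete Euclidean graph (O(n^2) time)."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n      # cheapest known connection to the current tree
    parent = [-1] * n
    best[0] = 0.0
    edges = set()
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.add(frozenset((u, parent[u])))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges

def closest_bichromatic_pair(points, red, blue):
    """Indices (r, b), r in red and b in blue, that minimize |rb| (brute force)."""
    return min(((r, b) for r in red for b in blue),
               key=lambda rb: math.dist(points[rb[0]], points[rb[1]]))

def cut_property_holds(points):
    """Check the cut property for every partition into two non-empty parts."""
    n = len(points)
    tree = emst_edges(points)
    for mask in range(1, 2 ** (n - 1)):       # the last point is always blue
        red = [i for i in range(n - 1) if (mask >> i) & 1]
        blue = [i for i in range(n) if i not in red]
        if frozenset(closest_bichromatic_pair(points, red, blue)) not in tree:
            return False
    return True

rng = random.Random(1)
pts = [(rng.random(), rng.random()) for _ in range(8)]  # distinct distances w.h.p.
print(cut_property_holds(pts))
```

The check enumerates all partitions and is therefore exponential; it is meant only to make the statement concrete, whereas the algorithm of this paper restricts attention to a linear number of carefully chosen pairs.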

2.2 Quadtrees: Compressed and c-Cluster

Let P be a planar point set. The spread of P is defined as the ratio between the largest and
the smallest distance between any two distinct points in P. A quadtree for P is a hierarchical
decomposition of an axis-aligned bounding square for P into smaller axis-aligned squares [3, 28,
33, 45]. A regular quadtree is constructed by successively subdividing every square with at least
two points into four congruent child squares. A node $v$ of a quadtree is associated with (i) $S_v$, the
square corresponding to $v$; (ii) $P_v$, the points contained in $S_v$; and (iii) $B_v$, the axis-aligned bounding
square for $P_v$. $S_v$ and $B_v$ are stored explicitly at the node. We write $|S_v|$ and $|B_v|$ for the diameter
of $S_v$ and $B_v$, and $c_v$ for the center of $S_v$. We will also use the shorthand $d(u, v) := d(S_u, S_v)$ to
denote the shortest distance between any point in $S_u$ and any point in $S_v$. Furthermore, we denote
the parent of $v$ by $\bar{v}$. Regular quadtrees can have unbounded depth (if P has unbounded spread),
so in order to give any theoretical guarantees the concept is usually refined. In the sequel, we use
two such variants of quadtrees, namely compressed and c-cluster quadtrees, which we show are in
fact equivalent.
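
To see why the depth of a regular quadtree depends on the spread rather than on the number of points, consider the following small sketch. It is an illustration under simplifying assumptions (a dictionary-based node layout, pairwise distinct points inside the base square), not the data structure used later in the paper.

```python
def build_quadtree(points, x, y, size):
    """Regular quadtree node for the square with lower-left corner (x, y) and side `size`.

    Assumes the points are pairwise distinct and lie inside the square.
    """
    node = {"square": (x, y, size), "points": points, "children": []}
    if len(points) >= 2:           # subdivide every square with at least two points
        half = size / 2
        mx, my = x + half, y + half
        buckets = {(i, j): [] for i in (0, 1) for j in (0, 1)}
        for p in points:
            buckets[(int(p[0] >= mx), int(p[1] >= my))].append(p)
        for (i, j), sub in buckets.items():
            node["children"].append(
                build_quadtree(sub, x + i * half, y + j * half, half))
    return node

def depth(node):
    """Number of nodes on a longest root-to-leaf path."""
    return 1 + max((depth(c) for c in node["children"]), default=0)

def three_points(k):
    """Three points in the unit square; two of them are at distance 2^-k."""
    return [(0.0, 0.0), (2.0 ** -k, 0.0), (0.75, 0.75)]

for k in (5, 10, 20):
    print(k, depth(build_quadtree(three_points(k), 0.0, 0.0, 1.0)))
```

With only three points the depth grows linearly in k, that is, logarithmically in the spread; compressed and c-cluster quadtrees avoid exactly these long paths of nodes with a single non-empty child.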

A compressed quadtree is a quadtree in which we replace long paths of nodes with only one child
by a single edge [4, 5, 8, 21]. It has size $O(|P|)$. Formally, given a large constant $\alpha$, an $\alpha$-compressed
quadtree is a regular quadtree with additional compressed nodes.² A compressed node $v$ has only
one child $w$ with $|S_w| \le |S_v|/\alpha$ and such that $S_v \setminus S_w$ has no points from P. Figure 3(a) shows
an example. Note that in our definition $S_w$ need not be aligned with $S_v$, which would happen if
we literally "compressed" a regular quadtree. This relaxed definition is necessary because existing
algorithms for computing aligned compressed quadtrees use a more powerful model of computation
than our real RAM/pointer machine (see Appendix A). In the usual applications of quadtrees, this
is acceptable. In fact, Har-Peled [33, Chapter 2] pointed out that some non-standard operation is
inevitable if we require that the squares of the compressed quadtree are perfectly aligned. However,
here we intend to derandomize algorithms that work on a traditional real RAM/pointer machine,
so we prefer to stay in this model. This keeps our results comparable with the previous work.

² Such nodes are often called cluster-nodes in the literature [4, 5, 8], but we prefer the term compressed to avoid
confusion with c-cluster quadtrees defined below.

Fig. 3: (a) A compressed quadtree on a set of 15 points. (b) A c-cluster tree on the same point set. (c) In a
c-cluster quadtree, the internal nodes of the c-cluster tree are replaced by quadtrees.

Now let $c$ be a large enough constant. A subset $U \subseteq P$ is a c-cluster if $U = P$ or
$d(U, P \setminus U) \ge c|B_U|$, where $B_U$ denotes the smallest axis-aligned bounding square for $U$, and $d(A, B)$ is the
minimum distance between a point in $A$ and a point in $B$ [37, 38]. In other words, $U$ is a c-cluster
precisely if $\{U, P \setminus U\}$ is a $(1/c)$-semi-separated pair [33, 50]. It is easily seen that the c-clusters for
P form a laminar family, i.e., a set system in which any two sets $A$ and $B$ satisfy either $A \cap B = \emptyset$,
$A \subseteq B$, or $B \subseteq A$. Thus, the c-clusters define a c-cluster tree $T_c$. Figure 3(b) shows an example.
These trees are a very natural way to tackle point sets of unbounded spread, and they have linear
size. However, they also may have high degree. To avoid this, a c-cluster tree $T_c$ can be augmented
by additional nodes, adding more structure to the parts of the point set that are not strongly
clustered. This is done as follows. First, recall that a quadtree is called balanced if for every node
$u$ that is either a leaf or a compressed node, the square $S_u$ is adjacent only to squares that are
within a factor 2 of the size of $S_u$.³ For each internal node $u$ of $T_c$ with set of children $V$, we build
a balanced regular quadtree on a set of points containing one representative point from each node
in $V$ (the intuition being that such a cluster is so small and far from its neighbors that we might
as well treat it as a point). This quadtree has size $O(|V|)$ (Lemma 3.4), so we obtain a tree of
constant degree and linear size, the c-cluster quadtree. Figure 3(c) shows an example. The sets $P_v$,
$S_v$, and $B_v$ for the c-cluster quadtree are just as for regular and compressed quadtrees, where in $P_v$
we expand the representative points appropriately. Note that it is possible that $S_v \not\supseteq P_v$, but the
points of $P_v$ can never be too far from $S_v$. In Section 3.1 we elaborate more on c-cluster quadtrees
and their properties, and in Section 3.3 we prove that c-cluster quadtrees and compressed quadtrees
are equivalent (Theorem 3.12).
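
The definition of a c-cluster and the laminarity of the family of c-clusters can be illustrated by brute force on a tiny point set. The sketch below is only meant to make the definition concrete; the helper names and the example points are assumptions, and the enumeration of all subsets is exponential.

```python
import math
from itertools import combinations

def bounding_square_diameter(U):
    """Diameter |B_U| of a smallest axis-aligned bounding square of U."""
    xs = [p[0] for p in U]
    ys = [p[1] for p in U]
    side = max(max(xs) - min(xs), max(ys) - min(ys))
    return side * math.sqrt(2)

def is_c_cluster(U, P, c):
    """U is a c-cluster if U = P or d(U, P \\ U) >= c * |B_U|."""
    rest = [p for p in P if p not in U]
    if not rest:
        return True
    gap = min(math.dist(u, q) for u in U for q in rest)
    return gap >= c * bounding_square_diameter(U)

def all_c_clusters(P, c):
    """All c-clusters of P by exhaustive search (tiny inputs only)."""
    return [frozenset(U)
            for k in range(1, len(P) + 1)
            for U in combinations(P, k)
            if is_c_cluster(U, P, c)]

def is_laminar(family):
    """Any two sets are disjoint or nested."""
    return all(not (A & B) or A <= B or B <= A
               for A, B in combinations(family, 2))

P = [(0, 0), (1, 0), (0, 1), (100, 0), (101, 1), (50, 300)]
clusters = all_c_clusters(P, c=2)
print(is_laminar(clusters))
```

In this example the two tight groups near (0, 0) and (100, 0) are 2-clusters, while a set such as {(0, 0), (100, 0)} is not, because its bounding square is large compared with its distance to the remaining points.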



³ We remind the reader that in our terminology, a compressed node is the node whose square contains a much
smaller quadtree, and not the root node of the smaller quadtree.
2.3 Well-Separated Pair Decompositions



For any two finite sets $U$ and $V$, let $U \otimes V := \{\{u, v\} \mid u \in U, v \in V, u \neq v\}$. A pair decomposition
$\mathcal{P}$ for a planar⁴ n-point set P is a set of m pairs $\{\{U_1, V_1\}, \ldots, \{U_m, V_m\}\}$, such that (i) for all
$i = 1, \ldots, m$, we have $U_i, V_i \subseteq P$ and $U_i \cap V_i = \emptyset$; and (ii) for any $\{p, q\} \in P \otimes P$, there is exactly
one $i$ with $\{p, q\} \in U_i \otimes V_i$. We call m the size of $\mathcal{P}$. Fix a constant $\varepsilon \in (0, 1)$, and let $\{U, V\} \in \mathcal{P}$.
Denote by $B_U$, $B_V$ the smallest axis-aligned squares containing $U$ and $V$. We say that $\{U, V\}$
is ε-well-separated if $\max\{|B_U|, |B_V|\} \le \varepsilon\, d(B_U, B_V)$, where $d(B_U, B_V)$ is the distance between
$B_U$ and $B_V$ (i.e., the smallest distance between a point in $B_U$ and a point in $B_V$). If $\{U, V\}$ is
not ε-well-separated, we say it is ε-ill-separated. We call $\mathcal{P}$ an ε-well-separated pair decomposition
(ε-WSPD) if all its pairs are ε-well-separated [11, 12, 26, 33].
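
As a small illustration of the separation condition, the following sketch tests whether two explicit point sets are ε-well-separated. The smallest enclosing axis-aligned square is not unique in general; anchoring it at the lower-left corner of the bounding box is an assumption of this sketch, as are the function names.

```python
import math

def bounding_square(U):
    """A smallest axis-aligned square containing U, as (x, y, side).

    The square is anchored at the lower-left corner of the bounding box.
    """
    xs = [p[0] for p in U]
    ys = [p[1] for p in U]
    side = max(max(xs) - min(xs), max(ys) - min(ys))
    return (min(xs), min(ys), side)

def diameter(square):
    return square[2] * math.sqrt(2)

def square_distance(A, B):
    """Smallest distance between a point of square A and a point of square B."""
    ax, ay, s = A
    bx, by, t = B
    dx = max(0.0, bx - (ax + s), ax - (bx + t))
    dy = max(0.0, by - (ay + s), ay - (by + t))
    return math.hypot(dx, dy)

def is_well_separated(U, V, eps):
    """max{|B_U|, |B_V|} <= eps * d(B_U, B_V)."""
    BU, BV = bounding_square(U), bounding_square(V)
    return max(diameter(BU), diameter(BV)) <= eps * square_distance(BU, BV)

U = [(0, 0), (1, 1)]
V = [(10, 0), (11, 1)]
print(is_well_separated(U, V, 0.2), is_well_separated(U, V, 0.1))
```

Here both bounding squares have diameter $\sqrt{2}$ and distance 9, so the pair is 0.2-well-separated but 0.1-ill-separated.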



Now let T be a (compressed or c-cluster) quadtree for P. Given $\varepsilon > 0$, it is well known that
T can be used to obtain an ε-WSPD for P in linear time [12, 33]. Since we will need some specific
properties of such an ε-WSPD, we give pseudo-code for such an algorithm in Algorithm 1. We call
this algorithm wspd, and denote its output on input T by wspd(T). The correctness of the algorithm
wspd is immediate, since it only outputs well-separated pairs, and the bounds on the running time
and the size of wspd(T) follow from a well-known volume argument which we omit [8, 12, 13, 33].

Algorithm 1 Finding a well-separated pair decomposition.
1. Call wspd(r) on the root r of T.

wspd(v)

1. If v is a leaf, return $\emptyset$.

2. Return the union of wspd(w) and wspd({w_1, w_2}) for all children w and pairs of distinct
children w_1, w_2 of v.

wspd({u, v})

1. If $S_u$ and $S_v$ are ε-well-separated, return {u, v}.

2. Otherwise, if $|S_u| \le |S_v|$, return the union of wspd({u, w}) for all children w of v.

3. Otherwise, return the union of wspd({w, v}) for all children w of u.
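
The recursion of Algorithm 1 can be sketched in Python as follows. This is a simplified illustration, not the linear-time implementation: it runs on a plain regular quadtree without compression, omits empty children, and assumes that the square of a leaf is shrunk to its single point so that two leaves are always well-separated. The class `Node` and all helper names are hypothetical.

```python
import math
import random
from itertools import combinations

class Node:
    """Node of a regular (uncompressed) quadtree; a leaf stores exactly one point."""
    def __init__(self, x, y, size, points):
        self.points, self.children = points, []
        if len(points) == 1:
            # Assumption of this sketch: a leaf's square is shrunk to its point.
            self.x, self.y, self.size = points[0][0], points[0][1], 0.0
            return
        self.x, self.y, self.size = x, y, size
        half = size / 2
        buckets = {}
        for p in points:
            key = (int(p[0] >= x + half), int(p[1] >= y + half))
            buckets.setdefault(key, []).append(p)
        for (i, j), sub in buckets.items():   # empty children are omitted
            self.children.append(Node(x + i * half, y + j * half, half, sub))

def diam(v):
    return v.size * math.sqrt(2)

def dist(u, v):
    """d(u, v): smallest distance between the squares S_u and S_v."""
    dx = max(0.0, v.x - (u.x + u.size), u.x - (v.x + v.size))
    dy = max(0.0, v.y - (u.y + u.size), u.y - (v.y + v.size))
    return math.hypot(dx, dy)

def well_separated(u, v, eps):
    return max(diam(u), diam(v)) <= eps * dist(u, v)

def wspd_pair(u, v, eps):
    if well_separated(u, v, eps):                     # step 1
        return [(u, v)]
    if u.size <= v.size:                              # step 2: split v
        return [pr for w in v.children for pr in wspd_pair(u, w, eps)]
    return [pr for w in u.children for pr in wspd_pair(w, v, eps)]  # step 3

def wspd(v, eps):
    pairs = []
    for w in v.children:
        pairs += wspd(w, eps)
    for w1, w2 in combinations(v.children, 2):
        pairs += wspd_pair(w1, w2, eps)
    return pairs

rng = random.Random(0)
pts = [(rng.random(), rng.random()) for _ in range(30)]
pairs = wspd(Node(0.0, 0.0, 1.0, pts), eps=0.5)
print(len(pairs))
```

Each returned pair consists of two nodes, so the point sets are represented implicitly by the subtrees. On the random input above one can check that every unordered pair of points is covered by exactly one node pair, as required by the definition of a pair decomposition.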



Theorem 2.1. There is an algorithm wspd that, given a (compressed or c-cluster) quadtree T for
a planar n-point set P, finds in time O(n) a linear-size ε-WSPD for P, denoted wspd(T). □

Note that the WSPD is not stored explicitly: we cannot afford to store all the pairs $\{U, V\}$,
since their total size might be quadratic. Instead, wspd(T) contains pairs $\{u, v\}$, where u and v are
nodes in T, and $\{u, v\}$ is used to represent the pair $\{P_u, P_v\}$.

⁴ Although some of these notions extend naturally to higher dimensions, the focus of this paper is on the plane.

Note that the algorithm computes the WSPD with respect to the squares $S_v$, instead of the
bounding squares $B_v$. This makes no big difference, since for compressed quadtrees $B_v \subseteq S_v$, and
for c-cluster quadtrees $B_v$ can be outside $S_v$ only for c-cluster nodes, resulting in a loss of at most
a factor $1 + 1/c$ in separation. Referring to the pseudo-code in Algorithm 1, we now prove three
observations. The first observation says that the size of the squares under consideration strictly
decreases throughout the algorithm.

Observation 2.2. Let $\{u, v\}$ be a pair of distinct nodes of T. If wspd({u, v}) is executed by wspd
run on T (in particular, if $\{u, v\} \in$ wspd(T)), then $\max\{|S_u|, |S_v|\} \le \min\{|S_{\bar u}|, |S_{\bar v}|\}$.

Proof. We use induction on the depth of the call stack for wspd({u, v}). Initially, u and v are
children of the same node, and the statement holds. Furthermore, assuming that wspd({u, v}) is
called by wspd($\{\bar u, v\}$) (and hence $|S_v| \le |S_{\bar u}|$), we get $\max\{|S_u|, |S_v|\} \le |S_{\bar u}| = \min\{|S_{\bar u}|, |S_{\bar v}|\}$,
where the last equation follows by induction. □

The next observation states that the wspd-pairs reported by the algorithm are, in a sense, as
high in the tree as possible.

Observation 2.3. If $\{u, v\} \in$ wspd(T), then $\bar u$ and $\bar v$ are ill-separated.

Proof. If $\bar u = \bar v$, the claim is obvious. Otherwise, let us assume that wspd({u, v}) was called
by wspd($\{\bar u, v\}$). This means that $\{\bar u, v\}$ is ill-separated and $\max\{|S_{\bar u}|, |S_v|\} = |S_{\bar u}|$. Therefore,
$\max\{|S_{\bar u}|, |S_{\bar v}|\} \ge |S_{\bar u}| > \varepsilon\, d(\bar u, v) \ge \varepsilon\, d(\bar u, \bar v)$, and $\{\bar u, \bar v\}$ is ill-separated. □

The last claim shows that for each wspd-pair, we can find well-behaved boxes whose size is
comparable to the distance between the point sets. In the following, this will be a useful tool for
making volume arguments that bound the number of wspd-pairs to consider.

Claim 2.4. Let $\{u, v\} \in$ wspd(T). Then there exist squares $R_u$ and $R_v$ such that (i) $S_u \subseteq R_u \subseteq S_{\bar u}$
and $S_v \subseteq R_v \subseteq S_{\bar v}$; (ii) $|R_u| = |R_v|$; and (iii) $|R_u|/2\varepsilon \le d(R_u, R_v) \le 2|R_u|/\varepsilon$.

Proof. Suppose wspd({u, v}) is called by wspd($\{\bar u, v\}$); the other case is symmetric. Let us define
$r := \min\{\varepsilon\, d(u, v), |S_{\bar u}|\}$. By Observation 2.2, we have $|S_u|, |S_v| \le |S_{\bar u}| \le |S_{\bar v}|$. Since $\{u, v\}$ is
well-separated, we have $\varepsilon\, d(u, v) \ge \max\{|S_u|, |S_v|\}$. Hence, $|S_{\bar u}|, |S_{\bar v}| \ge r \ge |S_u|, |S_v|$, and we can
pick squares $R_u$ and $R_v$ of diameter r that fulfill (i). Now (ii) holds by construction, and it remains
to check (iii). First, note that $d(R_u, R_v) \ge d(u, v) - 2r \ge (1 - 2\varepsilon)\, d(u, v) \ge r/2\varepsilon$, for $\varepsilon \le 1/4$. This
proves the lower bound. For the upper bound, observe that $\varepsilon\, d(u, v) \le \varepsilon (d(\bar u, v) + |S_{\bar u}|) \le (1 + \varepsilon)|S_{\bar u}|$,
because $\{\bar u, v\}$ is ill-separated. Thus, we have $\varepsilon\, d(u, v)/2 \le r$, and $d(R_u, R_v) \le d(u, v) \le 2r/\varepsilon$, as
desired. □



3 More on Quadtrees

In this section, we describe a few more properties of the c-cluster trees and c-cluster quadtrees
defined in Section 2.2, and we prove that they are equivalent to the more standard compressed
quadtrees (Theorem 3.12). Since most of the material is very technical, we encourage the impatient
reader to skip ahead to Section 4.



3.1 c-Cluster Quadtrees

Krznaric and Levcopoulos [37, Theorem 7] showed that a c-cluster tree can be computed in linear
time from a Delaunay triangulation.

Theorem 3.1 (Krznaric-Levcopoulos). Let P be a planar n-point set. Given a constant $c \ge 1$ and
DT(P), we can find a c-cluster tree $T_c$ for P in O(n) time and space on a pointer machine. □

Here, we will actually use a more relaxed notion of c-cluster trees: let $c_1, c_2$ be two constants
with $1 < c_1 \le c_2$, and let P be a planar n-point set. A $(c_1, c_2)$-cluster tree $T_{(c_1,c_2)}$ is a rooted tree
in which each inner node has at least two children and which has n leaves, one for each point in P.
Each node $v \in T_{(c_1,c_2)}$ corresponds to a subset $P_v \subseteq P$ in the natural way. Every node $v$ must fulfill
two properties: (i) if $v$ is not the root, then $d(P_v, P \setminus P_v) \ge c_1|B_{P_v}|$; and (ii) if $P_v$ has a proper
subset $Q \subset P_v$ with $d(Q, P \setminus Q) \ge c_2|B_Q|$, then there is a child $w$ of $v$ with $Q \subseteq P_w$. In other
words, each node of $T_{(c_1,c_2)}$ corresponds to a $c_1$-cluster of P, and $T_{(c_1,c_2)}$ must have a node for every
$c_2$-cluster of P. Thus, the original c-cluster tree is also a $(c, c)$-cluster tree. Our relaxed definition
allows for some flexibility in the construction of $T_{(c_1,c_2)}$ while providing the same benefits as the
original c-cluster tree. Thus, outside this section we will be slightly sloppy and not distinguish
between c-cluster trees and $(c, O(c))$-cluster trees.

As mentioned above, the tree $T_{(c_1,c_2)}$ is quite similar to a well-separated pair decomposition:
any two unrelated nodes in $T_{(c_1,c_2)}$ correspond to a $(1/c_1)$-well-separated pair. However, $T_{(c_1,c_2)}$ has
the huge drawback that it may contain nodes of unbounded degree. For example, if the points in
P are arranged in a square grid, then $T_{(c_1,c_2)}$ consists of a single root with n children. Nonetheless,
$T_{(c_1,c_2)}$ is still useful, since it represents a decomposition of P into well-behaved pieces. As explained
above, the $(c_1, c_2)$-cluster quadtree T is obtained by augmenting $T_{(c_1,c_2)}$ with quadtree-like pieces
to replace the nodes with many children.

We will now prove some relevant properties of $(c_1, c_2)$-cluster quadtrees. For a node $u$ of $T_{(c_1,c_2)}$,
let $T_u$ be the balanced regular quadtree on the representative points of its children. The direct
neighbors of a square S in $T_u$ are the 8 squares of size $|S|$ that surround S. First, we recall how
the balanced tree $T_u$ is obtained: we start with a regular (uncompressed) quadtree T' for the
representative points. While T' is not balanced, we take a leaf square S of T' that is adjacent to a
leaf square of size less than $|S|/2$ and we split S into four congruent child squares. The following
theorem is well known.

Theorem 3.2 (Theorem 14.4 of [3]). Let T' be a quadtree with m nodes. The above procedure
yields a balanced quadtree with O(m) nodes, and it can be implemented to run in O(m) time. □
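
The balancing procedure can be sketched directly on the set of leaf squares. The following is a naive quadratic-time illustration, not the linear-time implementation behind Theorem 3.2. It assumes integer coordinates with power-of-two side lengths, and it treats two squares as adjacent if they share a boundary segment of positive length; both choices, as well as the function names, are assumptions of this sketch.

```python
from itertools import combinations

def adjacent(a, b):
    """Squares (x, y, side) that share a boundary segment of positive length."""
    ax, ay, s = a
    bx, by, t = b
    x_overlap = min(ax + s, bx + t) - max(ax, bx)
    y_overlap = min(ay + s, by + t) - max(ay, by)
    return (x_overlap == 0 and y_overlap > 0) or (y_overlap == 0 and x_overlap > 0)

def balance(leaves):
    """Split any leaf S adjacent to a leaf of size less than |S|/2, until none is left."""
    leaves = set(leaves)
    changed = True
    while changed:
        changed = False
        for a in list(leaves):
            if any(adjacent(a, b) and 2 * b[2] < a[2] for b in leaves):
                x, y, s = a
                h = s // 2
                leaves.remove(a)
                leaves |= {(x, y, h), (x + h, y, h), (x, y + h, h), (x + h, y + h, h)}
                changed = True
    return leaves

def is_balanced(leaves):
    return all(not adjacent(a, b) or max(a[2], b[2]) <= 2 * min(a[2], b[2])
               for a, b in combinations(leaves, 2))

# Leaves of an unbalanced quadtree on the square [0, 8]^2.
unbalanced = [(4, 0, 4), (0, 4, 4), (4, 4, 4),
              (0, 0, 2), (2, 0, 2), (0, 2, 2),
              (2, 2, 1), (3, 2, 1), (2, 3, 1), (3, 3, 1)]
balanced = balance(unbalanced)
print(is_balanced(unbalanced), is_balanced(balanced), len(balanced))
```

In this example the two squares of side 4 that touch the squares of side 1 are split once, which is enough to restore balance.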

Let $v$ be a child of $u$ in $T_{(c_1,c_2)}$. The properties of the balanced quadtree $T_u$ and the fact that
the children of $u$ are mutually well-separated yield the following observation.

Observation 3.3. If $c_1$ is large enough, at most four leaf squares of $T_u$ contain points from $P_v$.

Proof. Let $d := |B_v|$ be the diameter of the bounding square for $P_v$. By definition, $P_v$ is a $c_1$-cluster,
so the distance from any point in $P_v$ to any point in $P \setminus P_v$ is at least $c_1 d$. Suppose that S is a leaf
square of $T_u$ with $S \cap P_v \neq \emptyset$, and let $\bar S$ be the parent of S.

There are two possible reasons for the creation of S: either S is part of the original regular
quadtree for the representative points, or S is generated during the balancing procedure. In the
former case, $\bar S$ contains at least two representative points. Thus, since in $\bar S$ there is a point from
$P_v$ and a point from $P \setminus P_v$, we have $|S| \ge c_1 d/2$. In the latter case, $\bar S$ must be a direct neighbor of
a square with at least two representative points (see [3, Proof of Theorem 14.4]). Therefore, since
$\bar S$ contains a point from $P_v$ and has a direct neighbor with a point from $P \setminus P_v$, the diameter of S
is at least $c_1 d/4$. Either way, we certainly have $|S| \ge c_1 d/4$.

Now if $c_1 \ge 8$, then $c_1 d/4 \ge 2d$, so the side length of every leaf square S that intersects $P_v$ is
strictly larger than d. Thus, $P_v$ can be covered by at most 4 such squares, and the claim follows. □

To see that $(c_1, c_2)$-cluster quadtrees have linear size, we need a property that is (somewhat
implicitly) shown in [38, Section 4.3].

Lemma 3.4. If $u$ has m children $v_1, v_2, \ldots, v_m$ in $T_c$, then $T_u$ has O(m) nodes.

Proof. Note that the total number of nodes in $T_u$ is proportional to the number of squares that
contain at least two representative points. Indeed, the number of squares in a balanced regular
quadtree is proportional to the number of squares in the corresponding unbalanced regular quadtree
(Theorem 3.2), and in that tree the squares with at least two points correspond to the internal
nodes, each of which has exactly four children. Thus, it suffices to show that the number of squares
in $T_u$ with at least two representative points is O(m).

Call a square S of $T_u$ full if S contains a representative point. A full square $S \in T_u$ is called
merged if it has at least two full children. There are O(m) merged squares, so we only need to bound
the number of non-merged full squares with at least two points. These squares can be charged to
the merged squares, using the following claim.

Claim 3.5. There exists a constant $\beta$ (depending on $c_1$) such that the following holds: for any full
square S with at least two representative points, one of the $\beta$ closest ancestors of S in $T_u$ (possibly
S itself) is either merged or has a merged direct neighbor.

Proof. Let S be a non-merged full square with at least two representative points. Since S intersects
more than one $P_{v_i}$, the definition of $T_{(c_1,c_2)}$ implies that the set $S \cap P_u$ is not a $c_2$-cluster. Thus,
$P_u \setminus S$ contains a point at distance at most $c_2|S|$ from S. Hence, S has an ancestor S' in $T_u$ that
is at most $O(\log c_2)$ levels above S and that has a full direct neighbor $S'' \neq S'$ (note that $T_u$ is
balanced, so S'' actually belongs to $T_u$).

We repeat the argument: since $(S' \cup S'') \cap P_u$ is not a $c_2$-cluster, there is a point in $P_u \setminus (S' \cup S'')$
at distance at most $c_2|S' \cup S''| \le 2c_2|S'|$ from $S' \cup S''$. Thus, if we go up $O(\log c_2)$ levels in $T_u$, we
either encounter a common ancestor of S' and S'', in which case we are done, or we have found a
set $\mathcal{S}$ of three full squares of $T_u$ such that (i) one square in $\mathcal{S}$ is an ancestor of S; (ii) the squares
in $\mathcal{S}$ have equal size; and (iii) the squares in $\mathcal{S}$ form a (topologically) connected set.

We keep repeating the argument while going up the tree. In each step, if we do not encounter
a common ancestor of at least two squares in $\mathcal{S}$, we can add one more full square to $\mathcal{S}$. However,
as soon as we have five squares of equal size that form a connected set, at least two of them have a
common parent. Thus, the process stops after at most two more iterations. Furthermore, since $\mathcal{S}$ is
connected, once at least two squares in $\mathcal{S}$ have a common parent, the parents of the other squares
must be direct neighbors of that parent. Hence, we found an ancestor of S that is only a constant
number of levels above S and that is merged or has a merged direct neighbor, as desired. □



Now we use Claim 3.5 to charge each non-merged full node with at least two representative
points to a merged node. Each merged node is charged at most $9 \cdot 4^{\beta} = O(1)$ times, and Lemma 3.4
follows. □



The proof of Lemma 3.4 implies the following, slightly stronger claim: Recall that $T_u$ was
constructed by building a regular quadtree for the representative points for its children, followed
by a balancing step. Now, suppose that before the balancing step we subdivide each leaf that
contains a representative point for a c-cluster C until it has size at most $\alpha\, d(C, P \setminus C)$, for some
constant $\alpha > 0$ (if the leaf is smaller than $\alpha\, d(C, P \setminus C)$, we do nothing). Call the tree that results
after the balancing step $\widetilde{T}_u$.

Corollary 3.6. The tree $\widetilde{T}_u$ has O(m) nodes.

Proof. We only need to worry about the additional squares created during the subdivision of the
leaves. If we take such a square and go up at most $\log(1/\alpha)$ levels in the tree, we get a square with
a direct neighbor that contains a point from another cluster. Now the argument from the proof of
Lemma 3.4 applies and we can charge the additional squares to merged squares, as before. □

Fig. 4: (a) A regular quadtree on a set of 8 points. (b) A slight shift of the base square may cause many new
compressed nodes in the quadtree.

3.2 Balancing and Shifting Compressed Quadtrees

In this section, we show that it is possible to "shift" a quadtree; that is, given a compressed quadtree
on a set of points P with base square R, to compute another compressed quadtree on P with a
base square that is similar to R, in linear time. The main difficulty lies in the fact that the clusters
in the two quadtrees can be very different, as illustrated in Figure 4.

Theorem 3.7. Suppose $\alpha$ is a sufficiently large constant and P is a planar n-point set. Furthermore,
let T be an $\alpha$-compressed quadtree for P with base square R, and let S be a square with $S \supseteq P$ and
$|S| = \Theta(|R|)$. Then we can construct in O(m) time a balanced $\alpha$-compressed quadtree T' for P with
base square S and with O(m) nodes.

The idea is to construct T' in the traditional way through repeated subdivision of the base
square S, while using the information provided by T in order to speed up the point location. We
will use the terms T-square and T'-square to distinguish the squares in the two trees. During the
subdivision process, we maintain the partial tree T', and for each square S' of T' we keep track of
the T-squares that have similar size as S' and that intersect S' (in an associated set). We call the
leaves of the current partial tree the frontier of T'. In each step, we pick a frontier T'-square and
split it, until we have reached a valid quadtree for P. We need to be careful in order to keep T'
balanced and in order to deal with compressed nodes. The former problem is handled by starting a
cascading split operation as soon as a single split makes T' unbalanced. For the latter problem, we
would like to treat the compressed children in the same way as the points in P, and handle them
later recursively. However, there is a problem: during the balancing procedure, it may happen that
a compressed child becomes too large for its parent square and should be part of the regular tree.
In order to deal with this, we must keep track of the compressed children in the associated sets of
the T'-squares. When we detect that a compressed child has become too large for its parent, we
treat it like a regular square. Once we are done, we recurse on the remaining compressed children.
Through a charging scheme, we can show that the overall work is linear in the size of T. The
following paragraphs describe the individual steps of the algorithm in more detail.

Initialization and Data Structures. We obtain from S a grid with squares of size in $(|R|/2, |R|]$, either by repeatedly subdividing S, if $|S| > |R|$; or by repeatedly doubling S, if $|S| \le |R|/2$. Since $|S| = \Theta(|R|)$, this requires a constant number of steps. Then we determine the T'-squares $S'_1, \dots, S'_k$ of that grid that intersect R (note that $k \le 9$). Our algorithm maintains the following data structures: (i) a list L of active T'-squares; and (ii) for each T'-square S' a list $\mathrm{as}(S')$ of associated T-squares. We will maintain the invariant that $\mathrm{as}(S')$ contains the smallest T-squares that have size at least $|S'|$ and that intersect S', as well as any compressed children that are contained in such a T-square and that intersect S'. This invariant implies that each S' has O(1) associated squares. We call a T'-square S' active if $\mathrm{as}(S')$ contains a T-square of size in $[|S'|, 2|S'|)$ or a compressed child of size in $[|S'|/2^{2\alpha}, |S'|)$. Initially, we set $L := \{S'_1, \dots, S'_k\}$ and $\mathrm{as}(S'_1) = \mathrm{as}(S'_2) = \dots = \mathrm{as}(S'_k) = \{R\}$, fulfilling the invariant.
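As a purely illustrative sketch (not part of the algorithm's formal description), the size normalization in the initialization step can be written as follows. The function name `grid_cell_size` and the use of floating-point side lengths are assumptions made here for the example; the loop count is constant whenever $|S| = \Theta(|R|)$.

```python
def grid_cell_size(s_size: float, r_size: float) -> tuple[float, int]:
    """Halve or double the side length of S until it lies in (r_size/2, r_size].

    Returns the resulting grid cell size and the number of halving/doubling
    steps that were needed (hypothetical helper, for illustration only).
    """
    size = s_size
    steps = 0
    # Repeated subdivision: used when S is larger than R.
    while size > r_size:
        size /= 2.0
        steps += 1
    # Repeated doubling: used when S is at most half the size of R.
    while size <= r_size / 2.0:
        size *= 2.0
        steps += 1
    return size, steps
```

For instance, a base square eight times larger than R needs three subdivisions before its cells fall into the target range.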

The Split Operation. The basic operation of our algorithm is the split. A split takes a T'-square S' and subdivides it into four children $S'_1, \dots, S'_4$. Then it computes the associated sets $\mathrm{as}(S'_1), \dots, \mathrm{as}(S'_4)$ as follows. For each $i = 1, \dots, 4$, we intersect $S'_i$ with all T-squares in $\mathrm{as}(S')$, and we put those T-squares into $\mathrm{as}(S'_i)$ that have non-empty intersection with $S'_i$. Then we replace each T-square in $\mathrm{as}(S'_i)$ that is neither a leaf, nor a compressed node, nor a compressed child by those of its children that have non-empty intersection with $S'_i$. Finally, we remove from $\mathrm{as}(S'_i)$ those compressed nodes whose compressed children have size at least $|S'_i|$ and intersect $S'_i$. Having determined $\mathrm{as}(S'_i)$, we use it to check whether $S'_i$ is active. If so, we add it to L. The split operation maintains the invariant about the associated sets, and it takes constant time.
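The first part of the split, subdividing a square and distributing the associated squares to the children by intersection, might be sketched as below. This is a simplified illustration under stated assumptions: squares are axis-aligned and given by their lower-left corner and side length, the names `Square`, `intersects`, `children` and `split` are hypothetical, and the later refinement steps (descending to T-children, removing compressed nodes) are omitted.

```python
from typing import NamedTuple


class Square(NamedTuple):
    x: float     # x-coordinate of the lower-left corner
    y: float     # y-coordinate of the lower-left corner
    size: float  # side length


def intersects(a: Square, b: Square) -> bool:
    """True if the interiors of the two squares overlap."""
    return (a.x < b.x + b.size and b.x < a.x + a.size and
            a.y < b.y + b.size and b.y < a.y + a.size)


def children(s: Square) -> list[Square]:
    """The four quadrants of s."""
    h = s.size / 2
    return [Square(s.x, s.y, h), Square(s.x + h, s.y, h),
            Square(s.x, s.y + h, h), Square(s.x + h, s.y + h, h)]


def split(s: Square, associated: list[Square]) -> dict[Square, list[Square]]:
    """Map each child of s to the associated squares that intersect it."""
    return {c: [t for t in associated if intersects(c, t)] for c in children(s)}
```

Since each associated set has constant size, this filtering step costs constant time per split, in line with the claim above.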

Main Body and Point-Location. We now describe the main body of our algorithm. It consists of phases. In each phase, we remove a T'-square S' from L. We perform a split operation on S' as described above. Then, we start the balancing procedure. For this, we check the four T'-squares in the current frontier that are directly above, below, to the left and to the right of S' to see whether any of them have size $2|S'|$. We put each such T'-square into a queue Q. Then, while Q is not empty, we remove a square N' from Q and perform a split operation on it (note that this may create new active squares). Furthermore, if N' is in L, we remove it from L. Finally, we consider the T'-squares of the current frontier directly above, below, to the left and to the right of N'. If any of them have size $2|N'|$ and are not in Q yet, we append them to Q and continue. The balancing procedure, and hence the phase, ends once Q is empty.
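The cascading balancing procedure can be illustrated on a plain (uncompressed) quadtree with integer coordinates. The sketch below is an assumption-laden toy model: the frontier is a set of `(x, y, size)` triples with power-of-two sizes, associated sets and the list L are ignored, and all function names are hypothetical. It only shows how the queue Q propagates splits to neighbors that are twice as large.

```python
from collections import deque


def find_leaf(leaves, px, py, root_size):
    """Return the frontier square containing unit cell (px, py), or None."""
    size = 1
    while size <= root_size:
        key = (px - px % size, py - py % size, size)
        if key in leaves:
            return key
        size *= 2
    return None


def split(leaves, sq):
    """Replace frontier square sq by its four children (requires size >= 2)."""
    x, y, s = sq
    h = s // 2
    leaves.remove(sq)
    leaves.update([(x, y, h), (x + h, y, h), (x, y + h, h), (x + h, y + h, h)])


def edge_neighbors(leaves, sq, root_size):
    """Frontier squares found directly left, right, below and above sq."""
    x, y, s = sq
    found = []
    for px, py in [(x - 1, y), (x + s, y), (x, y - 1), (x, y + s)]:
        if 0 <= px < root_size and 0 <= py < root_size:
            n = find_leaf(leaves, px, py, root_size)
            if n is not None:
                found.append(n)
    return found


def split_and_balance(leaves, sq, root_size):
    """Split sq, then cascade splits through the queue Q; return #splits."""
    split(leaves, sq)
    queue = deque(n for n in edge_neighbors(leaves, sq, root_size)
                  if n[2] == 2 * sq[2])
    queued = set(queue)
    n_splits = 1
    while queue:
        nb = queue.popleft()
        split(leaves, nb)
        n_splits += 1
        for m in edge_neighbors(leaves, nb, root_size):
            if m[2] == 2 * nb[2] and m not in queued:
                queue.append(m)
                queued.add(m)
    return n_splits


def is_balanced(leaves):
    """Check that edge-adjacent frontier squares differ by a factor <= 2."""
    for (x, y, s) in leaves:
        for (u, v, t) in leaves:
            v_edge = (x + s == u or u + t == x) and y < v + t and v < y + s
            h_edge = (y + s == v or v + t == y) and x < u + t and u < x + s
            if (v_edge or h_edge) and max(s, t) > 2 * min(s, t):
                return False
    return True
```

In this toy model, a single split near the center of a coarse grid can trigger several further splits, but the frontier remains balanced after every phase.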

We continue this process until L is empty. Next, we do point-location. Let S' be a T'-square of the current frontier. Since L is empty, S' is associated with O(1) T-squares, all of which are either leaves or compressed nodes or compressed children in T. For each T-leaf that intersects S', we determine whether it contains a point that lies in S'. In the end, we have a set of at most four points from P or compressed children of T that intersect S', and we call this set the secondary associated set for S', denoted by $\mathrm{as}_2(S')$. We do this for every T'-square in the current frontier.

Fig. 5: (a) A frontier square S' of T' intersects several compressed children of T. We identify the list X of T'-squares that intersect the same children. (b) To apply the shifting algorithm recursively, we choose base squares R and S aligned with T and T'.



The Secondary Stage. Next, the goal is to build a small compressed quadtree for the secondary associated set of each square in the current frontier. Of course, the tree needs to remain balanced. For this, we start an operation that is similar to the main body of the algorithm. We call a T'-square S' post-active if $|\mathrm{as}_2(S')| \ge 2$ and the smallest bounding square for the elements in $\mathrm{as}_2(S')$ has size larger than $|S'|/128\alpha$. We put all the post-active squares into a list $L_2$ and we proceed as before: we repeatedly take a post-active square from $L_2$, split it, and then perform a balancing procedure. Here, the splitting operation is as follows: given a square S', we split it into four children $S'_1, \dots, S'_4$. By comparing each child to each element in the secondary associated set $\mathrm{as}_2(S')$, we determine the new secondary associated sets $\mathrm{as}_2(S'_1), \dots, \mathrm{as}_2(S'_4)$. We use these associated sets to check which children $S'_i$ (if any) are post-active and add them to $L_2$, if necessary. This splitting operation takes constant time. Again, it may happen that the balancing procedure creates new post-active squares. We repeat this procedure until $L_2$ is empty.
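The post-active test is a simple predicate on the secondary associated set. A minimal sketch follows, assuming for simplicity that every element of the set is a point (a compressed child would contribute its whole square to the bounding square); the names `bounding_square_size` and `is_post_active` are hypothetical.

```python
def bounding_square_size(points):
    """Side length of the smallest axis-aligned square enclosing the points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return max(max(xs) - min(xs), max(ys) - min(ys))


def is_post_active(square_size, as2, alpha):
    """True if the set has at least two elements and its bounding square is
    larger than square_size / (128 * alpha)."""
    if len(as2) < 2:
        return False
    return bounding_square_size(as2) > square_size / (128 * alpha)
```

Squares that fail the test either hold at most one element or hold a tight cluster, which is handled later by the recursive call.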



Setting Up the Recursive Calls. After the secondary stage, there are no more post-active squares, so for each square S' in the current frontier we have (i) $|\mathrm{as}_2(S')| \le 1$; or (ii) the smallest bounding square of $\mathrm{as}_2(S')$ has size at most $|S'|/128\alpha$. Below, in Lemma 3.9, we will argue that if $\mathrm{as}_2(S')$ contains a single compressed child C, then C has size at most $|S'|/128\alpha$. Thus, (ii) holds in any case. The goal now is to set up a recursive call of the algorithm to handle the remaining compressed children. Unfortunately, a compressed child may intersect several leaf T'-squares, so we need to be careful about choosing the base squares for the recursion.

Let S' be a square of the current frontier, and set $X := \{S'\}$. While there is a compressed child C in $\mathrm{as}_2(X) := \bigcup_{S'' \in X} \mathrm{as}_2(S'')$ that intersects the boundary of $S(X) := \bigcup_{S'' \in X} S''$, we add all the T'-squares of the current frontier that are intersected by C to X. Since T' is balanced, the i-th square $S^{(i)}$ that we add to X has size at most $2^i|S'|$ and hence the bounding square of $\mathrm{as}_2(S^{(i)})$ has size at most $2^i|S'|/128\alpha$. By construction, $\mathrm{as}_2(S^{(i)})$ contains at least one element that intersects a square in the old X, so by induction we know that after i steps the set $\mathrm{as}_2(X)$ has a bounding square of size at most $2^{i+1}|S'|/128\alpha$. It follows that the process stops after at most three steps (i.e., when X has four elements), because after four steps we would have a bounding square of size at most $2^5|S'|/128\alpha \le |S'|/4\alpha$ that is intersected by five disjoint squares of size at least $|S'|/2^4 = |S'|/16$ (since T' is balanced), which is impossible (for α large enough). Figure 5(a) shows an example.

Now we put two base squares around $\mathrm{as}_2(X)$: a square R that is aligned with T, and a square S that is aligned with T'. For R, if $\mathrm{as}_2(X)$ contains only one element, we just use the bounding square of $\mathrm{as}_2(X)$. If $|\mathrm{as}_2(X)| \ge 2$, then the elements of $\mathrm{as}_2(X)$ are separated by an edge or a corner between leaf T-squares. Thus, we can pick a base square R for $\mathrm{as}_2(X)$ such that (i) $|R| \le 2^6|S'|/128\alpha = |S'|/2\alpha$; (ii) R is aligned with T; and (iii) the first split of R separates the elements in $\mathrm{as}_2(X)$. For S, if $|X| = 1$, we just use the bounding square for $\mathrm{as}_2(X)$. If $|X| \ge 2$, the squares in X must share a common edge or corner, and we can find a base square S such that (i) S contains $\mathrm{as}_2(X)$; (ii) the first split of S produces squares that are aligned with this edge or corner of X; and (iii) $|S| \le 2^6|S'|/128\alpha = |S'|/2\alpha$. Figure 5(b) shows an example. We now construct an α-compressed quadtree T with base square R for the elements of $\mathrm{as}_2(X)$ in the obvious way. (If $\mathrm{as}_2(X)$ contains any compressed children, we reuse them as compressed children for T. This may lead to a violation of the condition for compressed nodes at the first level of T. However, our algorithm automatically treats large compressed children as active squares, so there is no problem.) This takes constant time. We call the algorithm recursively to shift T to the new base square S. Note that this leads to a valid α-compressed quadtree since either S is wholly contained in S'; or the first split of S produces squares that are wholly contained in the T'-leaf squares and have size at most $|S'|/4\alpha$, while each square that intersects S has size at least $|S'|/4$, as T' is balanced. We repeat the procedure for every leaf T'-square whose secondary associated set we have not processed yet.



Analysis. The resulting tree T' is a balanced α-compressed quadtree for P. It remains to prove that the algorithm runs in linear time. The initialization stage needs O(1) steps. Next, we consider the main body of the algorithm. Since each split takes constant time, the total running time for the main body is proportional to the number of splits. Recall that a T'-square S' is called active if it is put into L, i.e., if $\mathrm{as}(S')$ contains a T-square of size in $[|S'|, 2|S'|)$ or a compressed child of size in $[|S'|/2^{2\alpha}, |S'|)$. Since each T-square can cause only a constant number of T'-squares to be active, the total number of active T'-squares is O(m). Thus, we can use the following lemma to conclude that the total number of splits in the main body of the algorithm is linear.

Lemma 3.8. Every split in the main body of the algorithm can be charged to an active T'-square 
such that each such square is charged a constant number of times. 

Proof. If we split an active square S', we can trivially charge the split to S'. Hence, the critical splits are the ones during the balancing procedure. By induction on the number of steps of the balancing procedure, we see that if a square S' is split, there must be a square N' in the current partial tree T' that is a direct neighbor of S' and that has an active descendant whose removal from L triggered the balancing procedure. (Recall that a direct neighbor of S' is one of the eight squares of size $|S'|$ that surround S'.)

If N' has an active ancestor N that is at most five levels above N' in T' (possibly N = N'), we charge the split of S' to N, and we are done. Otherwise, we know that $\mathrm{as}(N')$ contains at least one compressed child of size less than $|N'|/2^{2\alpha}$ (otherwise, N' would not have an active descendant or would itself be active) and T-squares of size at least $64|N'|$ (otherwise, one of the five nodes above N' in T' would have been active). Now, before S' is split, there must have been a split on N': otherwise the active descendant of N' that triggers the split on S' would not exist. Thus, we repeat the argument to show that N' has a direct neighbor N'' with an active descendant that triggers the split of S'. Note that $N'' \neq S'$, because the split on N' happens before the split on S'. If N'' has an active ancestor that is at most five levels higher up in T' (possibly N'' itself), we are done again. Otherwise, we repeat the argument again.

We claim that this process finishes after at most 16 steps. Indeed, suppose we find 17 squares $S' = N^{(0)}, N^{(1)}, N^{(2)}, \dots, N^{(17)}$ without stopping. We know that each $N^{(i)}$ is a direct neighbor of $N^{(i-1)}$ and that each $N^{(i)}$ is associated with a compressed child of size at most $|S'|/2^{2\alpha}$ and with T-squares of size at least $64|S'|$. Since the set $\bigcup_{i=0}^{17} N^{(i)}$ has diameter at most $17|S'|$, the set $\bigcup_{i=0}^{17} \mathrm{as}(N^{(i)})$ contains at most four T-squares of size at least $64|S'|$. Now each compressed child in an associated set $\mathrm{as}(N^{(i)})$ is the only child of one of these four large T-squares, so there are at most four of them. Furthermore, each such compressed child is intersected by at most four disjoint T'-squares of size $|S'|$, so there can be at most 16 squares $N^{(i)}$, a contradiction. Hence, we can charge each split to an active square in the desired fashion, and the lemma follows. □

Next, we analyze the running time of the secondary stage. Again, the running time is proportional to the number of splits, which is bounded by the following lemma.

Lemma 3.9. Let S' be a frontier T'-square at the beginning of the secondary stage. Then after the secondary stage, the subtree rooted at S' has height at most $O(\log \alpha)$.

Proof. Below, we will argue that for every descendant S'' of S', if $\mathrm{as}_2(S'')$ contains a compressed child C, then $|C| \le |S''|/2^{\alpha}$. For now, suppose that this holds.

First, we claim that there are $O(\log \alpha)$ splits to post-active descendants of S'. The secondary associated set $\mathrm{as}_2(S')$ contains at most four elements, so $\mathrm{as}_2(S')$ has at most 11 subsets with two or more elements. Fix such a subset A. Then S' has at most $O(\log \alpha)$ post-active descendants with secondary associated set A. This is because each level of T' has at most two squares with secondary associated set A, and the post-active squares with secondary associated set A must have size between $|B(A)|/2$ and $128\alpha|B(A)|$, where $B(A)$ denotes the smallest bounding square for the elements in A. (Here we use our claim that the compressed children in the secondary associated set of each frontier T'-square S'' are much smaller than S''.) There are only $O(\log \alpha)$ such levels, so adding over all A, we see that S' has at most $O(\log \alpha)$ post-active descendants, implying the claim.
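The constant 11 used above is just the number of subsets of a four-element set that have at least two elements, $\binom{4}{2} + \binom{4}{3} + \binom{4}{4} = 6 + 4 + 1$. A short, purely illustrative enumeration (the function name is hypothetical) confirms the count:

```python
from itertools import combinations


def subsets_with_at_least_two(elements):
    """All subsets of `elements` that contain two or more elements."""
    items = list(elements)
    return [set(c)
            for r in range(2, len(items) + 1)
            for c in combinations(items, r)]
```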

Each split creates at most one new level below S', so there are only $O(\log \alpha)$ new levels due to splits to post-active descendants of S'. Next, we bound the number of new levels that are created by splits during the balancing phases. Each balancing phase creates at most one new level below S'. Furthermore, by induction on the number of steps in the balancing phase, we see that the balancing phase was triggered by the split of a post-active square that is a descendant either of S' or of a direct neighbor of S'. At the beginning of the secondary stage, there are O(1) T'-squares that are descendants of direct neighbors of S' (as T' is balanced). As we argued above, each of them has at most $O(\log \alpha)$ post-active descendants. Thus, the balancing phases add at most $O(\log \alpha)$ new levels below S'.

Finally, we need to justify the assumption that for any descendant S'' with a compressed child $C \in \mathrm{as}_2(S'')$, we have $|C| \le |S''|/2^{\alpha}$. By construction, we have $|C| \le |S'|/2^{2\alpha}$. Suppose that S' has a descendant S'' that violates this assumption. The square S'' was created through a split in the secondary stage, and suppose that S'' is the first such square during the whole secondary stage. This means that during all previous splits, the assumption holds, so by the argument above, there are at most $O(\log \alpha)$ levels below S'. This means that $|S''| \ge |S'|/\alpha^{O(1)}$, so we would get $|S'|/2^{2\alpha} \ge |C| > |S''|/2^{\alpha} \ge |S'|/(\alpha^{O(1)} 2^{\alpha}) > |S'|/2^{2\alpha}$, a contradiction (for α large enough). Thus, no S'' can violate the assumption, as desired. □

The time to set up the recursion is constant for each square of the current frontier. From Lemmas 3.8 and 3.9, we can conclude that the total time of the algorithm is O(m), which also implies that T' has O(m) squares. This concludes the proof of Theorem 3.7.



Special Cases. We note two useful special cases of Theorem 3.7. The first one gives an analog of Theorem 3.2 for compressed quadtrees.



Corollary 3.10. Let T be an α-compressed quadtree with m nodes. There exists a balanced α-compressed quadtree that contains T, has O(m) nodes and can be constructed in O(m) time.



Proof. Let R be the base square of T. We apply Theorem 3.7 with S = R. □ 



The second special case says that we can realign an uncompressed quadtree locally in any way we want, as long as we are willing to relax the definition of quadtree slightly. Let P be a planar point set. We call a quadtree for P λ-relaxed if it has at most λ points of P in each leaf, and is otherwise a regular quadtree.

Corollary 3.11. Let P be a planar point set and T a regular quadtree for P, with base square R. Let S be another square with $S \supseteq P$ and $|S| = \Theta(|R|)$. Then we can build a 4-relaxed quadtree T' for P with base square S in $O(|T|)$ time such that T' has $O(|T|)$ nodes.



Proof. We apply Theorem 3.7 to T, but we stop the algorithm before the beginning of the secondary stage. Since each secondary associated set for a leaf square has at most four elements, and since T contains no compressed nodes, the resulting tree T' has the desired properties. □

3.3 Equivalence of Compressed and c-Cluster Quadtrees

The goal of this section is to prove the following theorem. 

Theorem 3.12. Let P be a planar n-point set. Given a $(c_1, c_2)$-cluster quadtree on P, we can compute in O(n) time an $O(c_1)$-compressed quadtree on P; and given an α-compressed quadtree on P, we can compute in O(n) time an $(\alpha^{1/5}, 2\alpha^{1/5})$-cluster quadtree on P.



We present the proof of Theorem 3.12 in two lemmas. 



Lemma 3.13. Let P be a planar n-point set. Given a $(c_1, c_2)$-cluster quadtree T for P, we can compute in linear time an $O(c_1)$-compressed quadtree T' on P.



Footnote (to the definition of λ-relaxed quadtrees above): We cannot get a non-relaxed (1-relaxed) uncompressed quadtree, since two points could be arbitrarily close to each other if they were separated by a boundary. However, we can always turn a λ-relaxed quadtree into a non-relaxed compressed quadtree in linear time again.
Proof. We construct the compressed quadtree in a top-down fashion, beginning from the root. Suppose that we have constructed a partial compressed quadtree T', and let q be the representative point for a node u in the $(c_1, c_2)$-cluster tree $T_{(c_1,c_2)}$ that corresponds to T. We show how to expand q in T' to the corresponding quadtree $T_u$.

First, we add to $T_u$ a new root that is aligned with the old base square and larger by a constant factor, such that the old base square does not touch any boundary of the new one. Next, we determine by a search from q which leaf squares of T' intersect $T_u$. By Observation 3.3, there are at most four such leaves, so this step takes constant time. (Note that since we grow the base square of each quadtree that we expand, it cannot happen that $T_u$ intersects the boundary of its parent quadtree.) Next, we repeatedly split each leaf that intersects $T_u$ and that contains some other point or compressed child until there are no more such leaves.

The proof of Observation 3.3 shows that every leaf square of T' that intersects $T_u$ has size at least $c_1 d/4$, where d is the size of $T_u$'s base square. If $T_u$ lies completely inside a leaf of T', we add $T_u$ as a compressed child to T'. If $T_u$ intersects more than one leaf square, we identify a square at most twice the size of $T_u$'s base square that is aligned appropriately with the relevant edges of T', and apply Corollary 3.11 to shift $T_u$ to this new base square. This results in a valid $O(c_1)$-compressed quadtree in which q has been expanded. We repeat this process until all the quadtree pieces of T have been integrated into a large compressed quadtree.

The total time for the top-down traversal and for the realignment procedures is linear. Furthermore, Corollary 3.6 shows that the total work for splitting the leaves of T' is also linear, since the points in the different clusters are $(1/c_1)$-semi-separated. Hence, the total running time is linear. □

Lemma 3.14. Let P be a planar n-point set, and T be an α-compressed quadtree for P. Then we can compute in linear time an $(\alpha^{1/5}, 2\alpha^{1/5})$-cluster quadtree for P.

Proof. We use Corollary 3.10 to balance T, but without the recursive calls for the remaining cluster nodes. This gives a balanced top-level quadtree $T_{\mathrm{top}}$ (possibly with some compressed children of T now integrated in the tree), in which each leaf square is associated with at most four points from P or compressed children of T. Furthermore, for each leaf square S of $T_{\mathrm{top}}$, we have a bounding square for the associated elements that is aligned with T and has size at most $|S|/\alpha$.

We use $T_{\mathrm{top}}$ to identify a partial cluster quadtree, and we then recurse on the compressed children. We say a square $S \in T_{\mathrm{top}}$ is full if there is a leaf below S with a non-empty associated set. Otherwise, S is empty. First, we consider the squares of $T_{\mathrm{top}}$ in top-down fashion and check for each full square S which direct neighbors of S are empty (this can be done in constant time since T is balanced). If S has at most three full direct neighbors, and if all these full squares share a common corner, we let U be a square that is aligned with S and contains the full squares (i.e., either U = S or U is a square of size $2|S|$ that contains S and its full neighbors). Next, we consider the squares of size $|U|$ in the $(4\alpha^{1/5} + 1) \times (4\alpha^{1/5} + 1)$ grid centered at U and check whether they are all empty (again, since T is balanced, this takes constant time). If so, the points associated with U define an $\alpha^{1/5}$-cluster. We put a representative point for the cluster into U, make a new quadtree with root U, and remove U's children from $T_{\mathrm{top}}$. We continue until all the squares of $T_{\mathrm{top}}$ have been traversed, and then we process all the new trees in a similar way, iterating if necessary. After we are done, a part of the cluster quadtree has been created, and we need to consider the compressed children to set up a recursion.
Fig. 6: (a) The set of possible directions between two unrelated nodes u and v. (b) The set of possible directions between well-separated pairs is small.

For this, we consider each non-empty leaf square S of the partial tree. Let B be the bounding square of the associated elements of S. We know that $|B| \le |S|/\alpha$, so the disk D of radius $2|B|\alpha^{1/5}$ centered at B intersects at most three other leaf squares. We check for each of these leaf squares whether D intersects the bounding square of its associated elements. If so, we make a new bounding square for the union of these elements and repeat. This can happen at most twice more, because in each step the size of the bounding square increases by a factor of at most $\alpha^{1/5}$. Hence, after three steps we have a disk D of radius $O(|B|\alpha^{4/5})$ that intersects four disjoint squares of size $\Omega(|B|\alpha)$ that share a corner. Thus, D must be completely contained in those squares. This also implies that this procedure yields an $\alpha^{1/5}$-cluster. For each such cluster, we create a representative point and an appropriate base square for the child quadtree. Then, we process the cluster recursively. In the end, we can prune the resulting compressed trees to remove unnecessary nodes.



By the proof of Corollary 3.10, and since we spend only constant additional time for each square, this procedure takes linear time. Furthermore, as we argued above, we create only $\alpha^{1/5}$-clusters. If $Q \subseteq P$ is a $2\alpha^{1/5}$-cluster, then Q is either contained in at most four leaf squares of $T_{\mathrm{top}}$ that share a corner or the bounding square $B_Q$ intersects at most four squares of $T_{\mathrm{top}}$ of size $\Theta(|B_Q|)$ such that the surrounding $(4\alpha^{1/5} + 1) \times (4\alpha^{1/5} + 1)$ grid contains only empty squares. In either case, Q (or a superset) is discovered. It follows that the result is a valid $(\alpha^{1/5}, 2\alpha^{1/5})$-cluster quadtree. □

4 From a c-Cluster Quadtree to the Delaunay Triangulation

We now come to the heart of the matter and show how to construct a DT from a WSPD. Let P be a set of points, and T a compressed quadtree for P. Throughout this section, ε is a small enough constant (say, $\varepsilon = \pi/400$), and k is a large enough constant (e.g., k = 100). Let u and v be two unrelated nodes of T, i.e., neither node is an ancestor of the other. Let $L_{uv}$ be the set of directed lines that stab $S_u$ before $S_v$. The set $\Phi_{uv} \subseteq [0, 2\pi)$ of directions for $L_{uv}$ is an interval modulo $2\pi$ whose extreme points correspond to the two diagonal bitangents of $S_u$ and $S_v$, i.e., the two lines that meet $S_u$ and $S_v$ in exactly one point each and have $S_u$ and $S_v$ to different sides. Figure 6(a) illustrates this.

Observation 4.1. Let u and v be two unrelated nodes of T, and let u' be a descendant of u and v' be a descendant of v. Then $\Phi_{u'v'} \subseteq \Phi_{uv}$.

Proof. This is immediate, because $S_{u'} \subseteq S_u$ and $S_{v'} \subseteq S_v$. □
Observation 4.2. If u and v are two nodes of T such that {u, v} is ε-well-separated, then $|\Phi_{uv}| \le 8\varepsilon$.

Proof. Let $d := |c_u c_v|$, $D_u$ be the disk around $c_u$ with radius $\varepsilon d$, and $D_v$ the disk around $c_v$ with the same radius. By well-separation, $S_u \subseteq D_u$ and $S_v \subseteq D_v$. Let β be the angle between the diagonal bitangents of $D_u$ and $D_v$. Then $|\Phi_{uv}| \le \beta$, and $\beta = 2\arcsin(\varepsilon d / \tfrac{1}{2}d) = 2\arcsin(2\varepsilon) \le 8\varepsilon$, as claimed. Figure 6(b) illustrates this. □
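As a quick numeric sanity check of the final inequality (an illustration only, with a hypothetical function name), one can evaluate $2\arcsin(2\varepsilon)$ for the value $\varepsilon = \pi/400$ suggested at the start of this section:

```python
import math


def bitangent_angle(eps: float) -> float:
    """Angle between the diagonal bitangents of two disks of radius eps*d
    whose centers are at distance d (the value of d cancels out)."""
    return 2.0 * math.asin(2.0 * eps)


eps = math.pi / 400
beta = bitangent_angle(eps)
# The bound from Observation 4.2: beta <= 8 * eps.
bound_holds = beta <= 8 * eps
```

For small ε the angle is close to 4ε, so the bound 8ε leaves a comfortable margin.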



For a number $\phi \in [0, 2\pi)$ we define $\Phi_\phi := \{\psi \bmod 2\pi \mid \psi \in [\phi - \varepsilon/2, \phi + \varepsilon/2]\}$, i.e., the set of all directions that differ from φ by at most ε/2. We say that an ordered pair (u, v) of nodes has direction φ if $\Phi_{uv} \cap \Phi_\phi \neq \emptyset$. We also say that a pair of points (p, q) has direction φ if the corresponding pair in the WSPD has direction φ. The same definition also applies to an edge. For a given point p in the plane, we define the ε-cone $C_\phi(p)$ as the cone with apex p and opening angle ε centered around the direction φ.
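Membership of a point in such a cone reduces to comparing two angles modulo $2\pi$. The following is a minimal sketch; the helper names `angle_diff` and `in_cone` are hypothetical, and the behavior for a query point equal to the apex is left unspecified.

```python
import math


def angle_diff(a: float, b: float) -> float:
    """Smallest absolute difference between two directions, in [0, pi]."""
    d = (a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)


def in_cone(apex, q, phi: float, opening: float) -> bool:
    """True if q lies in the cone with the given apex and opening angle,
    centered around direction phi."""
    theta = math.atan2(q[1] - apex[1], q[0] - apex[0])
    return angle_diff(theta, phi) <= opening / 2
```

Taking the difference modulo $2\pi$ matters for directions close to 0, where the cone wraps around.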

4.1 Constructing a Supergraph of the EMST 

In the following, we abbreviate $\mathcal{P} := \mathrm{wspd}(T)$. The goal of this section is to construct a graph H with vertex set P and O(n) edges, such that $\mathrm{emst}(P) \subseteq H$. It is well known that if we take the graph H' on P with edge set $E := \{e_{uv} \mid \{u, v\} \in \mathcal{P}\}$, where each $e_{uv}$ connects the bichromatic closest pair for $P_u$ and $P_v$, then H' contains emst(P) and has O(n) edges [26]. However, as defined, it is not clear how to find H' in linear time. There are several major obstacles. Firstly, even though the tree T has O(n) nodes, it could be that $\sum_{u \in T} |P_u| = \Omega(n^2)$. Secondly, even if the total size of all $P_u$'s was O(n), we still need to find bichromatic closest pairs for all pairs in $\mathcal{P}$. Thus, a large set $P_u$ might appear in many pairs of $\mathcal{P}$, making the total problem size superlinear. Thirdly, we need to actually solve the bichromatic closest pair problems. A straightforward solution to find the bichromatic closest pair for sets R and B with sizes r and b would take time $O((r + b)\log(\min(r, b)))$, by computing the Voronoi diagram for the smaller set and locating all points from the other set in it. We need to find a way to do it in linear time.
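For reference, the bichromatic closest pair problem itself is easy to state in code. The brute-force sketch below takes $O(rb)$ time and is neither the Voronoi-based method mentioned above nor the linear-time approach developed in this section; it is only meant to make the problem concrete. The function name is hypothetical.

```python
import math


def bichromatic_closest_pair(red, blue):
    """Return the pair (r, b), r in red and b in blue, at minimum distance.

    Brute force over all red-blue pairs; both inputs must be non-empty.
    """
    return min(((r, b) for r in red for b in blue),
               key=lambda rb: math.dist(rb[0], rb[1]))
```

A call such as `bichromatic_closest_pair([(0, 0), (5, 5)], [(1, 0), (10, 10)])` compares all four red-blue pairs and returns the pair at distance 1.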

To address these problems, we actually construct a slightly larger graph H, by partitioning the pairs in $\mathcal{P}$ according to their direction. More precisely, let $Y = \{0, \varepsilon, 2\varepsilon, \dots, (l - 1)\varepsilon\}$ be a set of l numbers, where we assume that $l = 2\pi/\varepsilon$ is an integer. For every $\phi \in Y$, we construct a graph $H_\phi$ with O(n) edges and then let $H = \bigcup_{\phi \in Y} H_\phi$. Given $\phi \in Y$, the graph $H_\phi$ is constructed in three steps:

1. For every node $u \in T$, select a subset $Z_u \subseteq P_u$, such that $\sum_{u \in T} |Z_u| = O(n)$, and such that $\{\{p, q\} \mid p \in Z_u, q \in Z_v, \{u, v\} \in \mathcal{P}\}$ still contains all edges of emst(P) with orientation φ. This addresses the first problem by making the total set size linear.

2. Find a subset $\mathcal{P}' \subseteq \mathcal{P}$, such that each $u \in T$ appears in O(1) pairs of $\mathcal{P}'$, and the set $\{\{p, q\} \mid p \in Z_u, q \in Z_v, \{u, v\} \in \mathcal{P}'\}$ contains all edges of emst(P) with orientation φ. In particular, we choose for every node $u \in T$ a subset $\mathcal{P}_u \subseteq \mathcal{P}$ such that $\mathcal{P}' = \bigcup_{u \in T} \mathcal{P}_u$, each pair in $\mathcal{P}_u$ contains u, and $|\mathcal{P}_u| = O(1)$. This addresses the second problem by ensuring that every set appears in O(1) pairs.



Footnote (to the proof of Observation 4.2): Recall that $c_u$ is the center point of $B_u$.
Fig. 7: (a) A node u in the quadtree, with $|P_u| = 8$. (b) The relevant wspd-pairs (in green) for the points in $P_u$ with direction φ (up). There are also wspd-pairs between u and other nodes above and below it. (c) For k = 1, $Z_u$ contains those $p \in P_u$ for which the lowest wspd-pair in the tree T' that involves p contains u. In other words, $Z_u$ has the points that do not have a green edge in both directions in (b).



3. For every pair $\{u, v\} \in \mathcal{P}'$, we include in $H_\phi$ the edge pq such that {p, q} is the closest pair in $Z_u \otimes Z_v$ (i.e., $\{p, q\} = \arg\min_{\{p', q'\} \in Z_u \otimes Z_v} |p'q'|$). Here we actually solve all the bichromatic closest pair problems.

Clearly, $H_\phi$ has O(n) edges, and we will show that H is indeed a supergraph of emst(P). Our strategy of subdividing the edges according to their orientation goes back to Yao, who used a similar scheme to find EMSTs in higher dimensions [51].
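To make the partition by direction concrete, the sketch below assigns a pair of points to the index of the nearest direction in $Y$. This is a simplification made here for illustration: in the text, the direction of a pair is defined through the interval $\Phi_{uv}$ of its WSPD pair (so a pair may have more than one direction), whereas this helper uses only the direction of the segment pq. The name `direction_index` is hypothetical.

```python
import math


def direction_index(p, q, l: int) -> int:
    """Index i in {0, ..., l-1} such that i * (2*pi/l) is the direction in Y
    closest to the direction from p to q (modulo 2*pi)."""
    eps = 2 * math.pi / l
    theta = math.atan2(q[1] - p[1], q[0] - p[0]) % (2 * math.pi)
    return round(theta / eps) % l
```

With l directions, every segment direction is within ε/2 of its assigned element of Y, which matches the width of $\Phi_\phi$.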



Step 1: Finding the $Z_u$'s. Recall that we fixed a direction $\phi \in Y$. Take the set $\mathcal{P}_\phi \subseteq \mathrm{wspd}(T)$ of pairs with direction φ. For a pair $\pi \in \mathcal{P}_\phi$, we write (u, v) for the tuple such that $\pi = \{u, v\}$ and $c_u$ comes before $c_v$ in direction φ; we call (u, v) a directed pair in $\mathcal{P}_\phi$. Call a node u of T full if either (i) u is the root; (ii) u is a non-empty leaf; or (iii) $\mathcal{P}_\phi$ has a directed pair (u, v). Let T' be the tree obtained from T by connecting every full node to its closest full ancestor, and by removing the other nodes. We can compute T' in linear time through a post-order traversal. Now, for every leaf v of T', put the point $p \in P_v$ into the sets $Z_u$, where u is one of the k closest ancestors of v in T'. Repeat this procedure, while changing property (iii) above so that $\mathcal{P}_\phi$ has a directed pair (v, u). This takes linear time, and $\sum_{u \in T} |Z_u| = O(n)$. Intuitively, $Z_u$ contains those points of $P_u$ that are sufficiently on the outside of the point set in direction φ. Figure 7 shows an example. Variants of the following claim have appeared several times before f[j|51].



Claim 4.3. Let $p \in P$, and let $C^+_\phi(p)$ denote the cone with apex p and opening angle $17\varepsilon$ centered around φ. Suppose that pq is an edge of emst(P) and $q \in C^+_\phi(p)$. Then q is the nearest neighbor of p in $C^+_\phi(p) \cap P$.

Proof. If pq is an edge of emst(P), then the lune L defined by p and q contains no point of P [3]. Since the opening angle of $C^+_\phi(p)$ is at most $\pi/3$, for ε small enough, the intersection of $C^+_\phi(p)$ with L equals the intersection of $C^+_\phi(p)$ with the disk around p of radius $|pq|$. Hence, q must be the nearest neighbor of p in $C^+_\phi(p) \cap P$. □
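The construction of the sets $Z_u$ in Step 1 can be sketched as follows. The representation is an assumption made for this example: T is given by a parent map, the set of full nodes is given explicitly, and a leaf is counted as the first of its own k closest ancestors in T'. The naive ancestor walk below is for clarity only and does not achieve the linear running time of the post-order traversal mentioned in the text. The name `build_Z` is hypothetical.

```python
def build_Z(parent, full, leaf_point, k):
    """parent: node -> parent in T (root maps to None).
    full: set of full nodes (must contain the root and all non-empty leaves).
    leaf_point: non-empty leaf -> its point.
    Returns (tparent, Z): the parent map of the contracted tree T' and the
    sets Z_u as lists of points."""

    def closest_full_ancestor(u):
        a = parent[u]
        while a is not None and a not in full:
            a = parent[a]
        return a

    # T': every full node is attached to its closest full ancestor.
    tparent = {u: closest_full_ancestor(u) for u in full}

    Z = {u: [] for u in full}
    for v, p in leaf_point.items():
        u, steps = v, 0
        # Walk up at most k nodes of T', starting at the leaf itself.
        while u is not None and steps < k:
            Z[u].append(p)
            u = tparent[u]
            steps += 1
    return tparent, Z
```

Each point is stored in at most k sets, so the total size of all $Z_u$ is at most k times the number of points, which is O(n) for constant k.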



Lemma 4.4. Let pq be an edge of emst(P) with direction φ, and let {u, v} be the corresponding wspd-pair. Then $\{p, q\} \in Z_u \otimes Z_v$.



Footnote (to Step 1): Recall that k is a sufficiently large constant.

Footnote (to the proof of Claim 4.3): L is the intersection of two disks with radius $|pq|$, one centered at p, the other centered at q.
Proof. Let w be the leaf for p, and suppose for contradiction that $p \notin Z_u$, i.e., u is not among the k closest ancestors of w in T'. This means there exists a sequence $u_1, u_2, \dots, u_k, u$ of k + 1 distinct ancestors of w, such that each node is an ancestor of all previous nodes and such that there are well-separated pairs $\{u_1, v_1\}, \{u_2, v_2\}, \dots, \{u_k, v_k\} \in \mathcal{P}_\phi$.

Let $C^+_\phi(p)$ be the cone with apex p and opening angle $17\varepsilon$ centered around φ. By Observation 4.2, we have $S_v, S_{v_1}, \dots, S_{v_k} \subseteq C^+_\phi(p)$. Furthermore, since {u, v} is well-separated, $d(u, v) \ge |S_u|/\varepsilon$. Now Claim 2.4 implies that there are squares $R_{u_1}, R_{v_1}$ such that (i) $S_{u_1} \subseteq R_{u_1} \subseteq S_{u_2}$ and $S_{v_1} \subseteq R_{v_1}$; (ii) $|R_{v_1}| = |R_{u_1}|$; and (iii) $d(R_{u_1}, R_{v_1}) \le 2|R_{u_1}|/\varepsilon$. This means that

$d(p, P_{v_1}) \le 2(1 + 1/\varepsilon)|R_{u_1}| \le 2(1 + 1/\varepsilon)|S_{u_2}| \le 2(1 + 1/\varepsilon)|S_u|/2^{k-1}$,

where in the first inequality we bounded the distance between any point in $R_{u_1}$ and any point in $R_{v_1}$ by the distance between the squares plus their diameter (since we do not know where the points lie inside the squares). The second inequality comes from $R_{u_1} \subseteq S_{u_2}$ and the third inequality is due to the fact that $S_{u_2}$ lies at least k − 1 levels below $S_u$ in T'.

Since $2(1 + 1/\varepsilon)/2^{k-1} < 1/\varepsilon$ for $k \ge 3$ and since $d(u, v) \ge |S_u|/\varepsilon$, this contradicts the fact that q is the nearest neighbor of p inside $C^+_\phi(p)$ (Claim 4.3). Thus, p must lie in $Z_u$. A symmetric argument shows $q \in Z_v$. □

Step 2: Finding the $\mathcal{P}_u$'s. For every node $u \in T$, we include in $\mathcal{P}_u$ the $k$ shortest pairs in direction
$\phi$, i.e., the pairs $\{u, v\} \in \mathrm{wspd}(T)$ such that (i) $c_v$ is contained in the $\varepsilon$-cone $C_\phi(c_u)$ with apex $c_u$
centered around direction $\phi$; and (ii) there are fewer than $k$ pairs $\{u, v'\} \in \mathrm{wspd}(T)$ that fulfill (i)
and have $|c_u c_{v'}| < |c_u c_v|$. Since $k$ is constant, the $\mathcal{P}_u$'s can be constructed in total linear time.
Even though each $\mathcal{P}_u$ contains a constant number of elements, a node might still appear in many
such sets, so we further prune the pairs: by examining the $\mathcal{P}_u$'s, determine for each $v \in T$ the
set $Q_v = \{u \in T \mid v \in \mathcal{P}_u\}$. For each $Q_v$, find the $k$ closest neighbors (measured by the distance
between their center points) of $v$ in $Q_v$, and for all other $\mathcal{P}_u$'s remove the corresponding pairs $\{u, v\}$.
Now each node appears in only a constant number of pairs of $\mathcal{P} = \bigcup_{u \in T} \mathcal{P}_u$.
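
The two selection stages of Step 2 can be sketched as follows. This is only an illustration under stated assumptions, not the linear-time pointer-machine procedure: the candidate pairs in direction $\phi$ (those satisfying condition (i)) are assumed to be given as directed tuples `(u, v, dist)`, where `dist` stands for $|c_u c_v|$, and the names `prune_pairs` and `candidates` are hypothetical.

```python
import heapq
from collections import defaultdict


def prune_pairs(candidates, k):
    """candidates: iterable of (u, v, dist) pairs that already satisfy condition (i).
    Returns the set of (u, v) pairs that survive both selection stages."""
    # Stage 1: for every u keep its k shortest pairs (this plays the role of P_u).
    by_u = defaultdict(list)
    for u, v, dist in candidates:
        by_u[u].append((dist, v))
    kept = []
    for u, lst in by_u.items():
        for dist, v in heapq.nsmallest(k, lst):
            kept.append((u, v, dist))

    # Stage 2: for every v collect Q_v = {u : v in P_u} and keep its k closest members.
    by_v = defaultdict(list)
    for u, v, dist in kept:
        by_v[v].append((dist, u))
    result = set()
    for v, lst in by_v.items():
        for dist, u in heapq.nsmallest(k, lst):
            result.add((u, v))
    return result
```

After the second stage every node occurs in at most $k$ pairs as first component and in at most $k$ pairs as second component, which is the constant bound used later in Lemma 4.14.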

Lemma 4.5. Let $pq$ be an edge of $\mathrm{emst}(P)$ with orientation $\phi$, and let $\{u, v\}$ be the corresponding
wspd-pair. Then $\{u, v\} \in \mathcal{P}_u$.

Proof. We show that $v$ is among the $k$ closest neighbors of $u$ in direction $\phi$; a symmetric argument
shows that $u$ is among the $k$ closest neighbors of $v$ in direction $-\phi$. We may assume that $|c_u c_v| = 1$.
Suppose that $\{u, v\}$ is not among the $k$ shortest pairs in direction $\phi$. Then there is a set $W$ of $k$ nodes
of $T$ such that for all $w \in W$ we have (i) $c_w \in C_\phi(c_u)$; (ii) $|c_u c_w| < 1$; and (iii) $\{u, w\} \in \mathrm{wspd}(T)$.



By Claim 2.4, there exists for every $w \in W$ a pair of squares $R_u(w), R_w$ such that $S_u \subseteq R_u(w)$,
$S_w \subseteq R_w$ and $|R_u(w)| = |R_w| \le 2\varepsilon\, d(R_u(w), R_w) \le 2\varepsilon$.

Let $C^+_\phi(p)$ be the cone with apex $p$ and opening angle $17\varepsilon$ centered around $\phi$. By Observation 4.2,
$S_w \subseteq C^+_\phi(p)$ for all $w \in W$. Furthermore, every $S_w$ contains a point at distance at most $1 + \varepsilon$ from $p$,
because $|c_w p| \le |c_w c_u| + |c_u p| \le 1 + \varepsilon$. Also, by Claim 4.3 every $S_w$ contains a point at distance at
least $|pq| \ge |c_u c_v| - |c_u p| - |q c_v| \ge 1 - 2\varepsilon$ from $p$. Thus, since $d(R_u(w), R_w) \le 2|R_w|/\varepsilon$ by Claim 2.4
and $d(R_u(w), R_w) \ge 1 - 2\varepsilon - 2|R_w|$, we get $|R_w| \ge \varepsilon/8$, for $\varepsilon$ small enough. However, this implies
that $W$ has only a constant number of squares: all $S_w$ (and hence all $R_w$) intersect the annular
segment $A$ inside $C^+_\phi(p)$ with inner radius $1 - 2\varepsilon$ and outer radius $1 + \varepsilon$ (see Figure 8). All $w \in W$
are unrelated, since they are paired with $u$ in $\mathrm{wspd}(T)$. Furthermore, the set $A$ has diameter $O(\varepsilon)$.
If $w \in W$ is a compressed child, then $R_w$ is contained in the parent of $w$ and intersects no other
$S_{w'}$, for $w' \in W$. Otherwise, $|S_w| \ge |R_w|/2$. Thus, if we assign to each compressed child $w \in W$ the
square $R_w$ and to each other node $w \in W$ the square $S_w$, we get a collection of $k$ disjoint squares
that meet $A$ and each have diameter $\Omega(\varepsilon)$. Since $A$ has diameter $O(\varepsilon)$, there can be only a constant
number of such squares, so choosing $k$ large enough leads to a contradiction. □

Fig. 8: All squares $R_w$ intersect the region $A$.
Step 3: Finding the Nearest Neighbors. Unlike in the previous steps, the algorithm for Step 3
is a bit involved, so we switch the order and begin by showing correctness.



Lemma 4.6. Let $pq$ be an edge of $\mathrm{emst}(P)$ with direction $\phi$ and let $\{u, v\}$ be the corresponding
wspd-pair. Then $\{p, q\}$ is the closest pair in $Z_u \otimes Z_v$.



Proof. By Lemma 4.4 we have $\{p, q\} \in Z_u \otimes Z_v$. Furthermore, the cut property of minimum
spanning trees implies that $pq \in \mathrm{emst}(Z_u \cup Z_v)$. Since $\{u, v\}$ is well-separated, we have

\[ \max_{\{p', q'\} \in Z_u \otimes Z_u \,\cup\, Z_v \otimes Z_v} |p'q'| \;<\; \min_{\{p', q'\} \in Z_u \otimes Z_v} |p'q'|. \qquad (1) \]

Now consider an execution of Kruskal's MST algorithm on $Z_u \cup Z_v$ [22, Chapter 23.2]. Let $\{p', q'\}$
be the closest pair in $Z_u \otimes Z_v$. By (1), the algorithm considers $p'q'$ only after processing all edges
in $Z_u \otimes Z_u \cup Z_v \otimes Z_v$. Hence, at that point the sets $Z_u$ and $Z_v$ are each contained in a connected
component of the partial spanning tree, and $\mathrm{emst}(Z_u \cup Z_v)$ can have at most one edge from $Z_u \otimes Z_v$.
Hence, it follows that $\{p, q\} = \{p', q'\}$, as claimed. □

We now describe the algorithm. For ease of exposition, we take $\phi = \pi/2$ (i.e., we assume
that $P$ is rotated so that $\phi$ points in the positive $y$-direction). Note that now the squares are
not generally axis-aligned anymore, but this will be no problem. Given a point $p \in \mathbb{R}^2$, we define
the four directional cones $C_\leftarrow(p)$, $C_\uparrow(p)$, $C_\rightarrow(p)$, and $C_\downarrow(p)$ as the leftward, upward, rightward and
downward cones with apex $p$ and opening angle $\pi/2$. The directional cones subdivide the plane
into four disjoint sectors. We will also need the extended rightward cone $C^+_\rightarrow(p)$ with apex $p$ and
opening angle $\pi/2 + 16\varepsilon$.

Claim 4.7. Let $(u, v)$ be a directed pair in $\mathcal{P}_\phi$, and suppose that $\{p, q\}$ with $p \in P_u$ and $q \in P_v$ is
the closest pair for $(u, v)$. Then $C_\uparrow(p) \cap P_u = \emptyset$ and $C_\downarrow(q) \cap P_v = \emptyset$.

Proof. We prove the claim for $C_\downarrow(q)$; the argument for $C_\uparrow(p)$ is symmetric. We may assume that
$|pq| = 1$. By assumption, the unit disk $D$ centered at $p$ contains no points of $P_v$, so it suffices to
show that $C_\downarrow(q) \cap S_v \subseteq D$. Since $\{u, v\} \in \mathcal{P}_\phi$, Observation 4.2 implies that the direction of the line $pq$
differs from $\phi$ by at most $17\varepsilon$. Therefore, the intersections of the boundaries of $C_\downarrow(q)$ and $D$ have
distance at least $\sqrt{2} - O(\varepsilon)$ from $q$. However, the pair $\{u, v\}$ is well-separated, so all points in $P_v$
have distance at most $\varepsilon$ from $q$, which implies the claim; see Figure 9. □

Footnote (to Claim 4.7): Recall that we set $\phi = \pi/2$, so "$\uparrow$" and "$\downarrow$" mean "in direction $\phi$" and "in direction $-\phi$".

Fig. 9: The intersection points of $D$ and the boundary of $C_\downarrow(q)$ lie outside $S_v$, so $S_v \cap C_\downarrow(q) \subseteq D$.

Fig. 10: (a) A node $u$ with $|Z_u| = 5$, and the relevant part of the quadtree. (b) The graph $\Gamma$. Tree edges are
black (going right). To avoid clutter, we just show two wspd edges (green, going left).

Given a set $Z_u$ for a node $u$ of $T$, we define the upper chain of $Z_u$, $\mathrm{UC}(Z_u)$, as follows: remove
from $Z_u$ all points $p$ such that $C_\uparrow(p)$ contains a point from $Z_u$ in its interior. Then sort the remaining points by
$x$-coordinate and connect consecutive points by line segments. All segments of $\mathrm{UC}(Z_u)$ have slopes
in $[-1, 1]$. Similarly, we define the lower chain of $Z_u$, $\mathrm{LC}(Z_u)$, by requiring the cones $C_\downarrow(p)$ for the
points in $\mathrm{LC}(Z_u)$ to be empty. The goal now is to compute $\mathrm{UC}(Z_u)$ and $\mathrm{LC}(Z_u)$ for all nodes $u$.

Define a directed graph $\Gamma$ as follows: we create two copies of each vertex $u$ in $T$, called start($u$)
and end($u$), and we add a directed edge from start($u$) to end($u$) for each such vertex. Furthermore,
we replace every edge $uv$ of $T$ ($u$ being the parent of $v$) by two edges: one from start($u$) to
start($v$), and one from end($v$) to end($u$). We call these edges the tree-edges. Finally, for every
pair $\{u, v\} \in \mathrm{wspd}(T)$, where $S_v$ is wholly contained in the extended rightward cone $C^+_\rightarrow(c_u)$, we
create a directed edge from end($u$) to start($v$). These edges are called wspd-edges. Figure 10
shows a small example.

Claim 4.8. The graph $\Gamma$ is acyclic.

Proof. Suppose $C$ is a cycle in $\Gamma$. The tree-edges form an acyclic subgraph, so $C$ has at least one
wspd-edge. Let $e_1, e_2, \dots, e_z$ be the sequence of wspd-edges along $C$, and let $v_1, \dots, v_z$ be such
that the endpoint of $e_i$ is of the form start($v_i$). Finally, write $C = e_1 \to C_1 \to e_2 \to C_2 \to
\cdots \to e_z \to C_z$, where $C_i$ is the sequence of tree-edges between two consecutive wspd-edges. Each
$C_i$ consists of a (possibly empty) sequence of start-start edges, followed by one start-end
edge and a (possibly empty) sequence of end-end edges. Thus, the origin of the next wspd-edge
$e_{i+1}$ is an end-node for an ancestor or a descendant of $v_i$ in $T$. In either case, by the definition of
wspd-edges, it follows that the leftmost point of $S_{v_{i+1}}$ lies strictly to the right of the leftmost point
of $S_{v_i}$. Indeed, write $e_{i+1} = (u_{i+1}, v_{i+1})$. Then $S_{v_{i+1}}$ lies strictly to the right of $S_{u_{i+1}}$, because
$S_{v_{i+1}} \subseteq C^+_\rightarrow(c_{u_{i+1}})$ and because $\{u_{i+1}, v_{i+1}\}$ is well-separated. If $u_{i+1}$ is a descendant of $v_i$, then
$S_{u_{i+1}} \subseteq S_{v_i}$ and the leftmost point of $S_{u_{i+1}}$ cannot lie to the left of the leftmost point of $S_{v_i}$, which
implies the claim. If $u_{i+1}$ is an ancestor of $v_i$, then all of $S_{v_{i+1}}$ is strictly to the right of $S_{v_i}$, and the
claim follows again. Thus, for every $i$ the leftmost point of $S_{v_{i+1}}$ lies strictly to the right of the leftmost point
of $S_{v_i}$, and the leftmost point of $S_{v_1}$ lies strictly to the right of the leftmost point in $S_{v_z}$, which is
absurd. □

Fig. 11: (a) A set of points, and all edges with a slope in $[-1, 1]$. By Claim 4.9 these edges are all (possibly
implicitly) present in $\Gamma$. (b) A possible ordering $<_\Gamma$ of the points that respects $\Gamma$.

Let $<_\Gamma$ be a topological ordering of the nodes of $\Gamma$.

Claim 4.9. Any pair $(p, q)$ of points in $Z_u$ with $p <_\Gamma q$ satisfies $q \notin C_\leftarrow(p)$.

Proof. Suppose for the sake of contradiction that $q \in C_\leftarrow(p)$. Let $v, w$ be the descendants of $u$
such that $q \in P_v$, $p \in P_w$, and $\{v, w\} \in \mathrm{wspd}(T)$. By Observation 4.2, $S_w$ lies completely in the
extended rightward cone $C^+_\rightarrow(c_v)$, so $\Gamma$ has an edge from end($v$) to start($w$). Now the tree edges in
$\Gamma$ require that the leaf with $q$ comes before end($v$) and the leaf with $p$ comes after start($w$). Hence
$q <_\Gamma p$, contradicting $p <_\Gamma q$, and the claim follows. □

Since all edges on $\mathrm{UC}(Z_u)$ have slopes in $[-1, 1]$, we immediately have the following corollary.

Corollary 4.10. The ordering $<_\Gamma$ respects the orders of $\mathrm{UC}(Z_u)$ and $\mathrm{LC}(Z_u)$.

For every node $u \in T$, let $<_u$ be the order that $<_\Gamma$ induces on the leaf nodes corresponding to
$Z_u$.

Claim 4.11. All the orderings $<_u$ can be found in total time $O(n)$.

Proof. To find the orderings $<_u$, perform a topological sort on $\Gamma$, in linear time [22, Chapter 22.4].
With each node $u$ of $T$ store a list $L_u$, initially empty. We scan the nodes of $\Gamma$ in order. Whenever
we see a leaf for a point $p \in P$, we append $p$ to the at most $2k$ lists $L_u$ for the nodes $u$ with $p \in Z_u$.
The total running time is $O(n + \sum_{u \in T} |Z_u|) = O(n)$, and $L_u$ is sorted according to $<_u$ for each
$u \in T$. □

Footnote (on the topological sort): Note that $\Gamma$ has $O(n)$ edges, as $|\mathrm{wspd}(T)| = O(n)$.
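
The construction of $\Gamma$ and the scan from the proof of Claim 4.11 can be sketched in a few lines. This is a minimal illustration that relies on the standard-library `graphlib` module (Python 3.9+) rather than on a pointer-machine implementation. The input encoding is an assumption: `children` maps every tree node to its list of children, `wspd_edges` lists the directed pairs $(u, v)$ for which $S_v$ lies in the extended rightward cone of $c_u$, and `members` maps each leaf to the nodes $u$ whose set $Z_u$ contains its point. The function names are hypothetical, and a leaf is reported when its start-copy is scanned, which is one of the two admissible choices.

```python
from graphlib import TopologicalSorter


def build_gamma(children, wspd_edges):
    """Return Gamma as a dict mapping each node copy to the set of its predecessors."""
    preds = {}
    for u in children:
        preds.setdefault(("start", u), set())
        preds.setdefault(("end", u), set()).add(("start", u))        # start(u) -> end(u)
        for v in children[u]:
            preds.setdefault(("start", v), set()).add(("start", u))  # start(u) -> start(v)
            preds[("end", u)].add(("end", v))                        # end(v) -> end(u)
    for u, v in wspd_edges:
        preds[("start", v)].add(("end", u))                          # end(u) -> start(v)
    return preds


def leaf_lists(children, wspd_edges, members):
    """Return dict u -> leaves of Z_u, ordered by a topological order of Gamma."""
    order = TopologicalSorter(build_gamma(children, wspd_edges)).static_order()
    lists = {}
    for kind, node in order:
        if kind == "start" and not children[node]:   # node is a leaf of T
            for u in members.get(node, []):
                lists.setdefault(u, []).append(node)
    return lists
```

If the wspd-edges were inconsistent (which Claim 4.8 rules out), `static_order()` would raise `graphlib.CycleError` instead of returning an order.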
Claim 4.12. For any node $u \in T$, if $Z_u$ is sorted according to $<_u$, we can find $\mathrm{UC}(Z_u)$ and $\mathrm{LC}(Z_u)$
in time $O(|Z_u|)$.

Proof. We can find $\mathrm{UC}(Z_u)$ by a Graham-type pass through $L_u$. An example of such a list is
shown in Figure 11(b). That is, we scan $L_u$ from left to right, maintaining a tentative upper
chain $U$, stored as a stack. Let $r$ be the rightmost point of $U$. On scanning a new point $p$, we
distinguish cases depending on which of the four quadrants $C_\leftarrow(r)$, $C_\uparrow(r)$, $C_\rightarrow(r)$, or $C_\downarrow(r)$ it lies in.
By Claim 4.9, we know that $p \notin C_\leftarrow(r)$. If $p \in C_\downarrow(r)$, we discard $p$ and continue to the next point
in $L_u$. If $p \in C_\uparrow(r)$, we pop $r$ from $U$ and reassess $p$ from the point of view of the new rightmost
point of $U$. If $p \in C_\rightarrow(r)$, we push $p$ onto $U$.

The algorithm takes $O(|Z_u|)$ time, because every point is pushed or popped from the stack
at most once and because it takes constant time to decide which point to push or pop. Now we
argue correctness. For this, we use induction in order to prove that after $i$ steps, we have correctly
computed the upper chain for the first $i$ points in $L_u$, denoted $\mathrm{UC}(L_i)$. This clearly holds for the first point.
Now consider the cases for the $(i + 1)$-th point $p$.

• If $p \in C_\downarrow(r)$, then $p$ is certainly not on the upper chain. Furthermore, $C_\downarrow(p) \subseteq C_\downarrow(r)$, so $p$
cannot conflict with any other point on $\mathrm{UC}(L_i)$, so in this case $\mathrm{UC}(L_{i+1}) = \mathrm{UC}(L_i)$.

• If $p \in C_\uparrow(r)$, then $C_\uparrow(p) \subseteq C_\uparrow(r)$ and $p$ must be on $\mathrm{UC}(L_{i+1})$. Furthermore, every point that
we remove from $\mathrm{UC}(L_i)$ has $p$ in its upper cone and cannot be on $\mathrm{UC}(L_{i+1})$. Now let $r'$ be
the first point of $\mathrm{UC}(L_i)$ that is not popped. Since $C_\leftarrow(r') \subseteq C_\leftarrow(p)$ and since the remainder
of $\mathrm{UC}(L_i)$ lies inside of $C_\leftarrow(r')$, there are no conflicts between $p$ and the points we have not
popped. Thus $\mathrm{UC}(L_{i+1})$ is computed correctly.

• If $p \in C_\rightarrow(r)$, then $C_\uparrow(p) \subseteq C_\uparrow(r) \cup C_\rightarrow(r)$, and $p$ is on $\mathrm{UC}(L_{i+1})$, because $C_\rightarrow(r)$ contains no
points from $L_i$. Furthermore, $\mathrm{UC}(L_i)$ is contained in $C_\leftarrow(p)$, so $p$ conflicts with no point on
$\mathrm{UC}(L_i)$ and the result is correct.

This finishes the inductive step and the correctness proof. The lower chain is computed in an
analogous manner. □
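
The Graham-type pass from the proof of Claim 4.12 can be sketched as follows, for $\phi = \pi/2$ and points given as `(x, y)` tuples. The function names `sector`, `upper_chain` and `upper_chain_bruteforce` are hypothetical. As a stand-in for the order $<_u$, the usage below sorts the points by $x$-coordinate, which is one order with the property required by Claim 4.9 (a later point never lies in the leftward cone of an earlier one). Points exactly on a cone boundary are assigned to the left or right sector here, which is an arbitrary tie-breaking choice of this sketch.

```python
def sector(r, p):
    """Return which directional cone with apex r contains p: 'up', 'down', 'right' or 'left'."""
    dx, dy = p[0] - r[0], p[1] - r[1]
    if dy > abs(dx):
        return "up"
    if -dy > abs(dx):
        return "down"
    return "right" if dx >= 0 else "left"


def upper_chain(points_in_order):
    """Stack-based pass; the input order must never place a point in the
    leftward cone of an earlier point."""
    chain = []
    for p in points_in_order:
        # Pop every chain point that has p in its upward cone.
        while chain and sector(chain[-1], p) == "up":
            chain.pop()
        if not chain:
            chain.append(p)
            continue
        where = sector(chain[-1], p)
        if where == "right":
            chain.append(p)
        elif where == "left":
            raise ValueError("input order violates the precondition of the pass")
        # where == "down": p lies below the current chain and is discarded.
    return chain


def upper_chain_bruteforce(points):
    """Quadratic reference: keep the points whose upward cone contains no other point."""
    keep = [p for p in points if not any(sector(p, q) == "up" for q in points)]
    return sorted(keep)
```

For example, `upper_chain(sorted([(0, 0), (1, 3), (2, 0.5), (4, 1)]))` keeps `(1, 3)` and `(4, 1)`: the first point is popped when `(1, 3)` arrives, and `(2, 0.5)` is discarded because it lies in the downward cone of `(1, 3)`.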

Claim 4.13. For any node $u \in T$ and any pair $\{u, v\}$ in $\mathcal{P}_u$, given $\mathrm{UC}(Z_u)$ and $\mathrm{LC}(Z_v)$, we can
find the closest pair in $Z_u \otimes Z_v$ in time $O(|Z_u| + |Z_v|)$.

Proof. Connect the endpoints of $\mathrm{UC}(Z_u)$ and $\mathrm{LC}(Z_v)$ to obtain a simple polygon (note that the two
new edges cannot intersect the chains, because $\{u, v\}$ has direction $\phi = \pi/2$, so by Observation 4.2
$\Phi_{uv} \subseteq [\pi/2 - 8\tfrac{1}{2}\varepsilon, \pi/2 + 8\tfrac{1}{2}\varepsilon]$ and all edges of the chains have slopes in $[-1, 1]$). Then use the
algorithm of Chin and Wang [20] to find the constrained DT of the polygon in time $O(|Z_u| + |Z_v|)$.
The closest pair will appear as an edge in this DT, and hence can be found in the claimed time. □



Lemma 4.14. In total linear time, we can find for every $u \in T$ and for every pair $\{u, v\} \in \mathcal{P}_u$ the
closest pair in $Z_u \otimes Z_v$.



Proof. By Claims 4.11, 4.12, and 4.13, the time to find all the closest pairs is proportional to

\[ O\Bigl(n + \sum_{u \in T} \sum_{\{u, v\} \in \mathcal{P}_u} (|Z_u| + |Z_v|)\Bigr) = O\Bigl(n + \sum_{u \in T} |Z_u|\Bigr) = O(n), \]

because every $v$ appears in only a constant number of $\mathcal{P}_u$'s. □

Footnote (to the proof of Claim 4.13): Actually, the resulting polygon is $x$-monotone, so the most difficult part of the algorithm by Chin and Wang [20],
finding the visibility map of the polygon [16], becomes much easier [31]. The problem may allow a much more direct
solution, but since we will later require Chin and Wang's algorithm in full generality, we do not pursue this direction.
Putting it together. We thus obtain the main result of this section.

Theorem 4.15. Given a compressed quadtree $T$ for $P$ and $\mathrm{wspd}(T)$, we can find a graph $H$ with
$O(n)$ edges such that $H$ contains all edges of $\mathrm{emst}(P)$. It takes $O(n)$ time to construct $H$.

Proof. The fact that $H$ contains the EMST follows from Lemmas 4.4, 4.5 and 4.6. The running
time follows from the discussion at the beginning of Steps 1 and 2 and from Lemma 4.14. □



4.2 Extracting the EMST 

We want to extract $\mathrm{emst}(P)$, but no general-purpose deterministic linear time pointer machine
algorithm for this problem is known: the fastest such algorithm whose running time can be analyzed
needs $O(n\alpha(n))$ steps [17]. However, the special structure of the graph $H$ and the c-cluster quadtree
$T$ make it possible to achieve linear time.

We know that $H$ contains all EMST edges. Furthermore, by construction each edge of $H$
corresponds to a wspd-pair. Thus, we can associate each edge $e$ of $H$ with two nodes $u$ and $v$ such
that $\{u, v\}$ is the wspd-pair for the endpoints of $e$. The pruning operation in Step 2 of Section 4.1
ensures that each node is associated with $O(1)$ edges of $H$, and we store a list of these edges at each
node of $T$. Now we use Theorem 3.12 to convert our quadtree into a c-cluster quadtree $T$. During
this conversion, we can preserve the information about which edges of $H$ are associated with which
nodes of $T$, because each old square overlaps with only a constant number of new squares of similar
size. A special case are those edges that have an endpoint associated with a compressed child.
During the conversion of Theorem 3.12, compressed children either become regular squares (during
the balancing operation), or they correspond to c-clusters and are replaced by representative points
in the parent tree. In the former case, we handle the compressed child just like any regular square;
in the latter case, we associate $e$ with the square that contains the representative point for the
c-cluster.

Next, we would like to ensure for each edge $e$ of $H$ that the associated squares in $T$ have size
between $\varepsilon|e|/2$ and $2\varepsilon|e|$, where $|e|$ denotes the length of $e$. For the endpoints that were associated
with regular squares in the original quadtree, such a square can be found by considering a
constant number of ancestors and descendants in $T$, by Claim 2.4. If the associated square was a
compressed child that has become a regular square, we may need to consider more than a constant
number of ancestors, but each such ancestor is considered only a constant number of times, since
the compressed child has a constant number of associated edges. If $e$ has an endpoint that is now
associated with a representative point, we may need to subdivide the square containing the representative
point, but by Corollary 3.6 the total work is linear. Thus, in total linear time we can
obtain a c-cluster tree $T$ such that each square of $T$ is associated with $O(1)$ edges of $H$ and such
that the two associated squares of each edge $e$ of $H$ contain the endpoints of $e$ and have size $\Theta(\varepsilon|e|)$.

By the cut property of minimum spanning trees, $\mathrm{emst}(P)$ is connected within each c-cluster.
Thus, we can process the clusters bottom-up, and we only need to find the EMST within a
c-cluster given that the points in each child are already connected. Within this cluster, $T$ is a regular
uncompressed quadtree, and we can use the structure of $T$ to perform an appropriate variant of
Boruvka's MST algorithm (7lp8| in linear time. 



Lemma 4.16. Let $T'$ be a subtree of $T$ corresponding to a c-cluster, and let $E$ be the edges in $H$
associated with $T'$. Then $\mathrm{emst}(P) \cap E$ can be computed in time $O(|E| + |V(T')|)$.

Proof. Let $\ell$ be the size of the root square of $T'$. Through a level order traversal of $T'$ we group the
squares in $V(T')$ by height into layers $V_1, V_2, \dots, V_h$ (where $V_1$ is the bottommost layer, and $V_h$
contains only the root). The squares in $V_i$ have size $\ell/2^{h-i}$. As stated above, each square $S$ has a
constant number of associated edges in $E$ that have one endpoint in $S$ and length between
$|S|/(2\varepsilon)$ and $2|S|/\varepsilon$. To find the EMST, we subdivide the edges into sets $E_i$, where $E_i$ contains all
edges with length in $[\ell/(\varepsilon 2^{h-i}), \ell/(\varepsilon 2^{h-i-1}))$. Given the $V_i$, we can determine the sets $E_i$ in total
time $O(|E| + |V(T')|)$, as the edges for $E_i$ are associated only with squares in $V_{i-a}, V_{i-a+1}, \dots,
V_{i+a}$, for some constant $a$. Note that every edge in $E_i$ is crossed by $O(1)$ other edges in $E_i$, because
all $e \in E_i$ have roughly the same length and because every pair of squares in $V_i$ has only a constant
number of associated edges in $E_i$.

Now we compute the EMST by processing the sets $E_1, \dots, E_h$ in order. Here is how to process
$E_i$. We consider the squares in $V_i$. Assume that we know for each square of $V_i$ the connected
component in the current partial EMST it meets (initially each c-cluster is its own component).
By the cut property, every square $S$ meets only one connected component, as $S$ is much smaller
than the edges in $E_i$. Eliminate all edges in $E_i$ between squares in the same component, and
remove duplicate edges between each two components, keeping only the shortest of these edges
(this takes $O(|E_i|)$ time with appropriate pointer manipulation). Then find the shortest edge out
of each component and add these edges to the partial EMST. Determine the new components
and merge their associated edge sets. This sequence of steps is called a Boruvka-phase. Perform
Boruvka-phases until $E_i$ has no edges left.

By the crossing-number inequality [41, Theorem 4.3.1], the number of edges considered in each
phase is proportional to the number $r$ of components with an outgoing edge in that phase. Indeed,
viewing each component as a supervertex, we have an embedding of a graph with $r$ vertices and $z$
edges such that there are $O(z)$ crossings (since every edge $e \in E_i$ is crossed by $O(1)$ other edges
in $E_i$). Thus, the crossing number inequality yields $z^3/r^2 \le \beta z$, for some constant $\beta > 0$, so
$z = O(r)$. Since the number of components at least halves in each phase, and since initially there
are at most $|V_i|$ components, the total time for $E_i$ is $O(|E_i| + |V_i|)$. Finally, label each square in
$V_{i+1}$ with the component it meets and proceed with round $i + 1$. In total, processing $T'$ takes time
$O(|V(T')| + |E|)$, as desired. □
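
To make the notion of a Boruvka-phase concrete, here is a sketch of the generic Boruvka algorithm on an abstract weighted graph. It is not the layered linear-time variant of Lemma 4.16: it tracks components with a union-find structure and rescans all edges in every phase, so it runs in $O(|E| \log |V|)$ time. The function name `boruvka_mst` and the edge encoding `(weight, u, v)` are assumptions made for illustration. Ties are broken by comparing whole tuples, which keeps the selected edges cycle-free.

```python
def boruvka_mst(n, edges):
    """n: number of vertices 0..n-1; edges: list of (weight, u, v).
    Returns the sorted list of edges of a minimum spanning forest."""
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst = []
    while True:
        # One phase: find the cheapest edge leaving every component.
        cheapest = {}
        for e in edges:
            _, u, v = e
            cu, cv = find(u), find(v)
            if cu == cv:
                continue  # edge inside a component, ignore it
            for c in (cu, cv):
                if c not in cheapest or e < cheapest[c]:
                    cheapest[c] = e
        if not cheapest:
            break  # no edge joins two different components any more
        # Add the selected edges and merge the components they join.
        for w, u, v in set(cheapest.values()):
            cu, cv = find(u), find(v)
            if cu != cv:
                parent[cu] = cv
                mst.append((w, u, v))
    return sorted(mst)
```

Every phase at least halves the number of components that still have an outgoing edge, which is the same halving argument that the proof above combines with the crossing-number bound.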

4.3 Finishing Up 

We conclude: 

Theorem 4.17. Let $P$ be a planar point set and $T$ be a compressed quadtree or a c-cluster quadtree
for $P$. Then $\mathrm{DT}(P)$ can be computed in time $O(|P|)$.

Proof. If $T$ is a c-cluster quadtree, invoke Theorem 3.12 to convert it to a compressed quadtree.
Then use Theorem 2.1 to obtain $\mathrm{wspd}(T)$. Next, apply Theorem 4.15 to compute the supergraph $H$
of $\mathrm{emst}(P)$. After that, if necessary, convert $T$ to a c-cluster quadtree for $P$ via Theorem 3.12, and
apply Lemma 4.16 to each c-cluster, in a bottom-up manner, to extract $\mathrm{emst}(P)$. Finally, apply
the algorithm by Chin and Wang [20] to find $\mathrm{DT}(P)$. All this takes time $O(|P|)$, as claimed. □
5 From Delaunay Triangulations to c-Cluster Quadtrees

For the second direction of our equivalence we need to show how to compute a c-cluster quadtree
for $P$ when given $\mathrm{DT}(P)$. This was already done by Krznaric and Levcopolous [37, 38], but their
algorithm works in a stronger model of computation which includes the floor function and allows
access to data at the bit level. As argued in the introduction, we prefer the real RAM/pointer
machine, so we need to do some work to adapt their algorithm to our computational model. In this
section we describe how Krznaric and Levcopolous's algorithm can be modified to avoid bucketing
and bit-twiddling techniques. The only difference is that in the resulting c-cluster quadtree the
squares for the c-clusters are not perfectly aligned with the squares of the parent quadtree. In our
setting, this does not matter. The goal of this section is to prove the following theorem.

Theorem 5.1. Given $\mathrm{DT}(P)$, we can compute a c-cluster quadtree for $P$ in linear deterministic
time on a pointer machine.

In the following, we will refer to the paper by Krznaric and Levcopolous [38] as KL. Our
description is meant to be self-contained; however, we refer the reader to KL for more intuition and
a more elaborate description of the main ideas.

5.1 Terminology 

We begin by recalling some terminology from KL. 

• neighborhood. The neighborhood of a square $S$ of a quadtree consists of the 25 squares of
size $|S|$ concentric around $S$ (including $S$); see Figure 12.

• direct neighborhood. The direct neighborhood of a square $S$ consists of the 9 squares of
size $|S|$ directly adjacent to $S$ (including $S$); see Figure 12.

• star of a square. Let $P$ be a planar point set, and let $S$ be a square. The star of $S$, denoted
by $\star(S)$, is the set of all edges $e$ in $\mathrm{DT}(P)$ such that (i) $e$ has one endpoint inside $S$ and one
endpoint outside the neighborhood of $S$; and (ii) $|e| \le 16|S|$, where $|e|$ is the length of $e$.

• dilation. Let $P$ be a planar point set, and $G$ a connected plane graph with vertex set $P$.
The dilation of $G$ is the distortion between the shortest path metric in $G$ and the Euclidean
distance, i.e., the maximum ratio, over all pairs of distinct points $p, q \in P$, between the length
of the shortest path in $G$ from $p$ to $q$, and $|pq|$. There are many families of planar graphs
whose dilation is bounded by a constant [23]. In particular, for any planar point set $P$, the
dilation of $\mathrm{DT}(P)$ is bounded by $2\pi/(3\cos(\pi/6)) \le 2.42$ [35].

• orientation. The orientation of a line segment $e$ is the angle the line through $e$ makes with
the $x$-axis.
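
As a small worked illustration of the dilation defined above, the following sketch computes it for an explicitly given connected graph by running the Floyd-Warshall all-pairs shortest path recurrence on Euclidean edge lengths. The function name `dilation` and the input encoding (a list of coordinate tuples and a list of index pairs) are assumptions. The cubic running time is fine for small examples only and plays no role in the linear-time algorithm.

```python
import math


def dilation(points, edges):
    """points: list of (x, y); edges: list of index pairs (i, j) of a connected graph.
    Returns the maximum ratio of graph distance to Euclidean distance."""
    n = len(points)
    dist = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0.0
    for i, j in edges:
        d = math.dist(points[i], points[j])
        dist[i][j] = dist[j][i] = min(dist[i][j], d)
    # Floyd-Warshall: allow intermediate vertices 0..k step by step.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return max(dist[i][j] / math.dist(points[i], points[j])
               for i in range(n) for j in range(i + 1, n))
```

For the boundary cycle of the unit square, opposite corners are at Euclidean distance $\sqrt{2}$ but at graph distance 2, so the dilation is $\sqrt{2}$.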



5.2 Preprocessing 



By Theorem 3.1 we can obtain a c-cluster tree $T_c$ for $P$ in linear time, given $\mathrm{DT}(P)$. Thus, we only
need to construct the regular quadtrees $T_u^Q$ for each node $u$ in $T_c$. This is done by processing each
node of $T_c$ individually. First, however, we need to perform a preprocessing step in order to find for
each edge $e$ of $\mathrm{DT}(P)$ the node of $T_c$ that is the least common ancestor of $e$'s endpoints. For every
node $u \in T_c$, we define out($u$) as the set of edges in $\mathrm{DT}(P)$ that have exactly one endpoint in $P_u$ and
both endpoints in $P_{\bar{u}}$, where $\bar{u}$ denotes the parent of $u$. Clearly, every edge is contained in exactly two sets out($u$) and out($v$), where
$u$ and $v$ are siblings in $T_c$. The following is a simple variant of a lemma from KL [38, Lemma 3].

Fig. 12: The neighborhood of a square $S$. The direct neighbors are shown in dark blue, the others in light blue.



Lemma 5.2 (Krznaric-Levcopolous). Let $P$ be a planar $n$-point set. Given $\mathrm{DT}(P)$ and a c-cluster
tree $T_c$ for $P$, the sets out($u$) for every node $u \in T_c$ can be found in overall $O(n)$ time and space
on a pointer machine.

Proof. KL show how to reduce the problem of determining the sets out($u$) to $O(n)$ off-line
least-common ancestor (lca) queries in two appropriate trees. For the lca-queries, they invoke an
algorithm by Harel and Tarjan [34] that requires the word RAM. However, since all lca-queries are
known in advance (i.e., the queries are off-line), we may instead use an algorithm by Buchsbaum
et al. [10, Theorem 6.1] which requires $O(n)$ time and space on a pointer machine. □



5.3 Processing a Single Node of $T_c$

We now describe the preprocessing that is necessary on a single node $u$ of $T_c$ before the quadtree $T_u^Q$
can be constructed. Let $v_1, v_2, \dots, v_m$ be the children of $u$. For each child $v_i$, let $\delta_i := d(P_{v_i}, P_u \setminus P_{v_i})$.

Claim 5.3. For $i = 1, \dots, m$, out($v_i$) contains an edge of length $\delta_i$.

Proof. If $\mathrm{DT}(P)$ contains an edge $e$ with an endpoint in $P_{v_i}$ and with length $\delta_i$, then $e$ must be
in out($v_i$), by the definition of a c-cluster. Since $\mathrm{emst}(P)$ is a subgraph of $\mathrm{DT}(P)$, it thus suffices
to show that $\mathrm{emst}(P)$ contains such an edge. Consider running Kruskal's MST algorithm on $P$.
According to the definition of a c-cluster, by the time the algorithm considers the edge $e$ that
achieves $\delta_i$, the partially constructed EMST contains exactly one connected component that has
precisely the points in $P_{v_i}$. Therefore, $e \in \mathrm{emst}(P)$, and the claim follows. □

Initialization. By scanning the sets out($v_i$), we determine a child $v_j$ with minimum $\delta_j$ (by Claim 5.3
a shortest edge in out($v_j$) has length $\delta_j$). We may assume that $j = 1$. Let $S_1$ be a square that
contains $P_{v_1}$ and that has side-length $\delta_1/8$. Let $\alpha$ be the smallest integer such that four squares of
size $2^{\alpha-1}\delta_1/8$ cover all of $P_u$. Lemma 3.4 implies that $\alpha = O(m)$.
Fig. 13: The initial quadtree.

The goal is to compute $T_u^Q$, the balanced regular quadtree aligned at $S_1$ such that each $P_{v_i}$
is contained in squares of size $\delta_i/8$. To begin, we use $S_1$ to initialize $T_u^Q$ as the partial balanced
quadtree shown in Figure 13. Every square $S$ of $T_u^Q$ stores the following fields:

• parent: a pointer to the parent square, nil for the root;

• children: pointers for the four children of $S$, nil for a leaf;

• neighbors: links to the four orthogonal neighbors of $S$ in the quadtree $T_u^Q$ with size $|S|$ (or
size $2|S|$, if no smaller neighbor exists).

The fields parent, children, and neighbors are initialized for all the nodes in $T_u^Q$.
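
The record stored with each square can be pictured as follows. This is only a sketch of the data layout, not of the balancing or neighbor-maintenance logic: the class name `Square`, the coordinate fields and the `subdivide` helper are hypothetical, and `subdivide` deliberately leaves the neighbor links untouched.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(eq=False)
class Square:
    x: float                      # lower-left corner
    y: float
    size: float                   # side length |S|
    parent: Optional["Square"] = None
    children: List["Square"] = field(default_factory=list)   # empty for a leaf
    # One link per orthogonal direction; None while the link is not yet set.
    neighbors: Dict[str, Optional["Square"]] = field(
        default_factory=lambda: {"left": None, "right": None, "up": None, "down": None})

    def subdivide(self):
        """Create the four children and set their parent pointers.
        Neighbor links are not maintained by this sketch."""
        h = self.size / 2
        self.children = [Square(self.x + dx * h, self.y + dy * h, h, parent=self)
                         for dy in (0, 1) for dx in (0, 1)]
        return self.children
```

With such links in place, locating the square that contains the far endpoint of an edge amounts to following neighbor pointers from square to square, which is how the tracing in the next subsection avoids the floor function.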

Lemma 5.4. The total time for the initialization phase is $O(m + \sum_{i=1}^{m} |\mathrm{out}(v_i)|)$.

Proof. By Lemma 3.4, the initial size of $T_u^Q$ is $O(m)$. All other operations consist of scanning the
out-lists or are linear in the size of $T_u^Q$. □
5.4 Building the Tree $T_u^Q$

Now we build the tree $T_u^Q$ by traversing $\mathrm{DT}(P)$ in a way reminiscent of Dijkstra's algorithm.
In their algorithm, KL make extensive use of the floor function in order to locate points inside their
quadtree squares. The purpose of this section is to argue that this point location work can be done
through local traversal of the quadtree, without the floor function. Refer to Algorithm 2. The heart
of the algorithm is the procedure explore, which is initially called as explore($\{S_1\}$, $2^{\alpha-1}\delta_1/8$). The
procedure explore builds the tree $T_u^Q$ level by level, beginning with the level of $S_1$. At each point,
it maintains a set active of all squares at the current level that contain a cluster that has already
been processed. For each such square $S$, it calls a function findStar. This function returns all
edges of the Delaunay triangulation that have one endpoint in $S$ and have length $a|S|$, for a constant
$a$. Using findStar we can find new clusters whose distance from the active clusters is comparable to
the size of the squares in the current level. We will say more about the implementation of findStar
below. For each new cluster, we call the procedure newCluster, which adds more squares to $T_u^Q$
to accommodate the new cluster and recursively explores the short edges out of this new cluster.
After the recursive call has finished, we can continue the exploration of the tree at the current level.

Algorithm 2 Computing a c-cluster quadtree for the children of a c-cluster.

explore($\mathcal{S}$, maxsize)

1. Set active := $\mathcal{S}$.

2. Set newActive := $\emptyset$.

3. Until the squares in active have size greater than maxsize:

(a) For every square $S$ in active call the function findStar($S$) to determine $\star(S)$. Append
the parent of $S$ to newActive, if it is not present yet.

(b) For every edge $e \in \bigcup_{S \in \mathrm{active}} \star(S)$, if $e$ has an endpoint in an undiscovered cluster,
call the function newCluster($S$, $e$), and append all the squares returned by this call to
newActive.

(c) Set active := newActive.

newCluster($S$, $e$)

1. Walk along $e$ through the current $T_u^Q$ to find the square $S'$ of $T_u^Q$ that contains the other
endpoint of $e$. This tracing is done by following the appropriate neighbor pointers from $S$.

2. Refine $T_u^Q$ for the new cluster, and let $\mathcal{S}'$ be the set of leaf squares containing the newly
discovered cluster.

3. Call explore($\mathcal{S}'$, size of squares in active). Afterwards, return the active squares from the
recursive call.

We now give the details for the refinement in Step 2 of newCluster: Let $v_j$ be the cluster that
contains the other endpoint $q$ of $e$ (we can find $v_j$ in constant time, since $e \in \mathrm{out}(v_j)$, and since for
each edge we store the two clusters whose out-lists contain it). Subdivide the current leaf square
containing $q$ (and possibly also its neighbors if they contain points from $P_{v_j}$) in quadtree-fashion
until $P_{v_j}$ is contained in squares of size $\delta_j/8$. Then balance the quadtree and update the neighbor
pointers accordingly.

The algorithm is recursive, and at each point there exists a sequence $\mathcal{E}_1, \mathcal{E}_2, \dots, \mathcal{E}_z$ of
instantiations (i.e., stack frames) of explore, where $\mathcal{E}_{i+1}$ was invoked by $\mathcal{E}_i$. Each $\mathcal{E}_i$ has a set $\mathrm{active}_i$
of active squares, such that all squares in each $\mathrm{active}_i$ have the same size, and such that the
squares in $\mathrm{active}_{i+1}$ are not larger than the squares in $\mathrm{active}_i$. We say that a square is active
if it is contained in $\mathrm{active}^{\cup} := \bigcup_{i=1}^{z} \mathrm{active}_i$. The neighborhood of $\mathrm{active}^{\cup}$ is the union of the
neighborhoods of all boxes in $\mathrm{active}^{\cup}$. We maintain the following invariant:

Invariant 5.5. At all times during the execution of explore, all undiscovered c-clusters lie outside 
the neighborhood of $\texttt{active}^{T}$. 



Claim 5.6. Invariant 5.5 is maintained by explore. 



Proof. The set $\texttt{active}^{T}$ only changes in Steps 1 and 3c. The invariant is maintained in Step 1 
since the size of the squares in S (i.e., $S_i/8$) is chosen such that their neighborhoods can contain 
no point from any other cluster. 



Let us now consider Step 3c. The set newActive contains two kinds of squares: (i) the parents 
of squares processed in the current iteration of the main loop; and (ii) squares that were added to 
newActive after a recursive call. We only need to focus on squares of type (i), since squares of 
type (ii) are already added to $\texttt{active}^{T}$ during the recursive call. Suppose that active contains 
a square S whose neighborhood has a point $p \in P$ in an undiscovered cluster. Since $S \in$ active, 
there is a point $q \in P \cap S$, and by the definition of neighborhood, we have $d(p, q) < 3|S|$. However, 
since the dilation of DT(P) is at most 2.5 [35], DT(P) contains a path $\pi$ of length at most $8|S|$ 
from p to q. Let p' be the last discovered point along $\pi$. The point p' lies in an active square S' 
with $|S'| \geq |S|$, and the edge e leaving p' on $\pi$ has length at most $8|S'|$. Therefore, $e \in \star(S'')$ for 
a descendant S'' of S', which contradicts the fact that p' is the last discovered point along $\pi$. □ 
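
The constant 8 in this argument follows from a short calculation, spelled out here using only the neighborhood bound and the dilation bound stated in the proof:

```latex
\[
  |\pi| \;\le\; 2.5 \cdot d(p,q) \;<\; 2.5 \cdot 3\,|S| \;=\; 7.5\,|S| \;\le\; 8\,|S| .
\]
```

Since $|S'| \geq |S|$, every edge on $\pi$, and in particular e, then has length at most $8|S| \leq 8|S'|$.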

Lemma 5.7. The total running time of explore, excluding the calls to findStar, is 
$O\bigl(m + \sum_{i=1}^{m} |\texttt{out}(v_i)|\bigr)$. 

Proof. All squares appearing in $\texttt{active}^{T}$ are ancestors of non-empty leaf squares in the final tree 
T_u. Therefore, by Lemma 3.4, the total number of iterations for the loop in Step 3a is O(m). 
Furthermore, $\star(S)$ contains only edges of length $O(|S|)$, so every edge appears in only a constant 
number of stars. It follows that the total size of the $\star$-lists, and hence the total number of iterations 
of the loop in Step 3b, is $O\bigl(\sum_{i=1}^{m} |\texttt{out}(v_i)|\bigr)$. 



It remains to bound the time for tracing the edges and balancing the tree. Since T_u is balanced 
and since $\star(S)$ contains only edges of length $O(|S|)$, the tracing along the neighbor pointers of 
an edge takes constant time (since we traverse a constant number of boxes of size $O(|S|)$). By 
Invariant 5.5, the other endpoint of the edge is contained in a leaf square of the current T_u of size 
$O(|S|)$. (This is because the quadtree is balanced and because the other endpoint of the edge lies 
outside the neighborhood of the active squares.) Therefore, the time to build the balanced quadtree 
for the new leaf squares containing the newly discovered cluster can be charged to the corresponding 
nodes in the final T_u, of which there are O(m). Furthermore, note that by Invariant 5.5, balancing 
the quadtree for the newly discovered leaf squares does not affect any descendants of the active 
squares. □ 



5.5 Implementing findStar 

KL show how to exploit the geometric properties of the Delaunay triangulation in order to 
implement the function findStar quickly. For this, they store two additional fields with each active 
square, called characteristic and shortcuts [38, Section 6], and they explain how to maintain 
these lists throughout the procedure. This part of the algorithm works on a real RAM/pointer 
machine without any further modification, so we just state their result. 

Lemma 5.8. The total time for all calls to findStar and the maintenance of the required data 
structures is $O\bigl(m + \sum_{i=1}^{m} |\texttt{out}(v_i)|\bigr)$. □ 

5.6 Putting Everything Together 

We can now finally prove Theorem 5.1. 



Proof of Theorem 5.1. First, we use Theorem 3.1 to find a c-cluster tree $T_c$ for P in O(n) time. 
Next, we use the algorithm from Section 5.2 to preprocess the tree. By Lemma 5.2, this also 
takes O(n) time. Finally, we process each node of $T_c$ using the algorithm from Section 5.3. By 
Lemmas 5.4, 5.7, and 5.8, this takes total time $O\bigl(\sum_j (1 + |\texttt{out}(v_j)|)\bigr)$, where the sum ranges over all 
the nodes of $T_c$. This sum is O(n) because there are O(n) nodes in $T_c$, and because every edge of 
DT(P) appears in exactly two out-lists. Hence, the total running time is linear, as claimed. □ 
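
For concreteness, the final counting step can be written out as follows. The symbols $V(T_c)$ and $E(\mathrm{DT}(P))$ for the node set of the cluster tree and the edge set of the triangulation are introduced here only for illustration; the bound of $3n - 6$ edges is the standard consequence of Euler's formula for planar graphs on $n \geq 3$ vertices.

```latex
\[
  \sum_{j} \bigl(1 + |\texttt{out}(v_j)|\bigr)
  \;=\; |V(T_c)| + 2\,|E(\mathrm{DT}(P))|
  \;\le\; O(n) + 2\,(3n - 6)
  \;=\; O(n).
\]
```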



6 Applications 

As mentioned in the introduction, our result yields deterministic versions of several recent 
randomized algorithms related to DTs. Firstly, we can immediately derandomize an algorithm for 
hereditary DTs by Chazelle et al. [18, 19]: 



Corollary 6.1. Let P be a planar n-point set, and let $S \subseteq P$. Given DT(P), we can find DT(S) in 
deterministic time O(n) on a pointer machine. 



Proof. Use Theorem 5.1 to find a c-cluster quadtree T for P, remove the leaves for $P \setminus S$ from T, and 
trim it appropriately. Finally, apply Theorem 4.17 to extract DT(S) from T, in time O(n). □ 
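
The "remove and trim" step of this proof can be illustrated on a toy tree. The sketch below is not the data structure used in the paper: it assumes a hypothetical representation in which a tree is either a leaf label or a Python list of subtrees, and the function name `prune` is made up for this example. It drops the leaves outside the kept set and any subtree that becomes empty, while leaving paths of single-child nodes in place.

```python
def prune(tree, keep):
    """Restrict `tree` to the leaves in `keep`; return None if nothing is left.

    A tree is either a leaf (a hashable label) or a list of subtrees.
    Nodes with a single remaining child are kept as they are.
    """
    if not isinstance(tree, list):
        return tree if tree in keep else None
    kids = [k for k in (prune(child, keep) for child in tree) if k is not None]
    return kids if kids else None


toy = [["a", "b"], [["c"], "d"]]
restricted = prune(toy, {"a", "c"})   # [["a"], [["c"]]]
```

The single-child paths that remain in `restricted` correspond to the partly compressed quadtree to which Theorem 4.17 is applied.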



Secondly, we obtain deterministic analogues of the algorithms by Buchin et al. [8] to preprocess 
imprecise point sets for faster DTs. For example, we can prove the following: 

Corollary 6.2. Let $\mathcal{R} = (R_1, R_2, \ldots, R_n)$ be a sequence of n $\beta$-fat planar regions so that no point 
in $\mathbb{R}^2$ meets more than k of them. We can preprocess $\mathcal{R}$ in $O(n \log n)$ deterministic time into an 
O(n)-size data structure so that given a sequence of n points $P = (p_1, p_2, \ldots, p_n)$ with $p_i \in R_i$ for 
all i, we can find DT(P) in deterministic time $O(n \log(k/\beta))$ on a pointer machine. 

Proof. The method of Buchin et al. [8, Theorem 4.3 and Corollary 5.6] proceeds by computing 
a representative quadtree T for $\mathcal{R}$. Given P, the algorithm finds for every point in P the leaf 
square of T that contains it, and then uses this information to obtain a compressed quadtree T' 
for P in time $O(n \log(k/\beta))$. However, T' is skewed in the sense that not all its squares need to 
be perfectly aligned and that some squares can be cut off. Nevertheless, the authors argue that even 
in this case wspd(T') takes O(n) time and yields a linear-size WSPD [8, Appendix B]. The main 
observation [8, Observation B.1] is that any (truncated) square S in T' is adjacent to at least 
one square whose area is at least a constant fraction of the area S would have without clipping. 
Since in skewed quadtrees the size of a node is at most half the size of its parent, the argument of 
Lemma 4.4 still applies. To see that Lemma 4.5 holds, we need to check that the volume argument 
goes through. For this, note that by the main observation of Buchin et al., we can assign every 
square $R_w$ (the notation is as in the proof of Lemma 4.5) to an adjacent square of comparable size 
at distance $O(\varepsilon)$ from A. Since every such square is charged by disjoint descendants from a constant 
number of neighbors, the volume argument still applies, and Lemma 4.5 holds. Lemma 4.14 only 
relies on well-separation and the combinatorial structure of T, and hence remains valid. Finally, in 
order to apply Lemma 4.16 we need to turn T' into a c-cluster quadtree, which takes linear time 
by Theorem 3.12. Thus, the total running time is $O(n \log(k/\beta))$, as claimed. □ 

Footnote to the proof of Corollary 6.1: Deleting $P \setminus S$ might create new c-clusters. However, since 
we are aiming for running time O(n), we can apply Theorem 4.17 to a partly compressed quadtree 
that may contain long paths where every node has only one child. 



Finally, Buchin and Mulzer [9] showed that for word RAMs, DTs are no harder than sorting. 
We can now achieve this deterministically. Let sort(n) be the time to sort n integers on a w-bit word 
RAM. The best deterministic bound for sort(n) is $O(n \log\log n)$ [32]. 



Corollary 6.3. Let P be a planar n-point set given by w-bit integers, for some word-size $w \geq \log n$. 
We can find DT(P) in deterministic time O(sort(n)) on a word RAM supporting the 
shuffle-operation. 



Proof. Buchin and Mulzer [9] show how to find a compressed quadtree T for P in time O(sort(n)), 
using the shuffle-operation. They actually do not find the squares of the quadtree, only the 
combinatorial structure of T and the bounding boxes $B_v$. It is easily seen that the algorithm wspd 
also works in this case. 
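
To make the shuffle-operation concrete, here is a minimal Python sketch of bit interleaving. It is illustrative only: it assumes that bit 1 of a word is its most significant bit, the explicit word size parameter `w` is an assumption of the sketch, and the loop takes O(w) time, whereas the word RAM model used here assumes that the operation is available in constant time.

```python
def shuffle(x, y, w):
    """Interleave the bits of two w-bit words x and y, most significant
    bit first, giving the 2w-bit word x1 y1 x2 y2 ... xw yw."""
    result = 0
    for i in range(w - 1, -1, -1):          # from the top bit down to bit 0
        result = (result << 1) | ((x >> i) & 1)
        result = (result << 1) | ((y >> i) & 1)
    return result


key = shuffle(0b10, 0b01, 2)   # bits 1,0,0,1 -> 0b1001
```

Sorting points by such interleaved keys orders them along the Z-order (Morton) curve, which is the usual reason a sorting routine can stand in for explicit quadtree construction.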



To apply Lemma 4.4 we need to check that the sizes of the bounding boxes decrease 
geometrically down the tree. For this, consider a node $v \in T$ with associated point set $P_v$ and the 
quadtree square $S_v$ (i.e., the smallest aligned square of size $2^l$ such that the coordinates of all 
points in $P_v$ share the first $w - l$ bits). Let $B_v$ be the bounding box of $P_v$, and let $l'$ be such that 
$2^{l'+1} > |B_v| \geq 2^{l'}$. Clearly, $B_v$ meets at most nine aligned squares of size $2^{l'}$, arranged in a 3 x 3 grid. 
Hence, any descendant $v'$ of v that is at least five levels below v must have $|B_{v'}| \leq |S_{v'}| \leq |B_v|/2$, 
since after at most four (compressed) quadtree divisions the squares for $B_v$ have been separated. 
Thus, the proof of Lemma 4.4 goes through as before, if we choose k larger and consider every fifth 
node along the chain $u_1, u_2, \ldots, u_k, u$. 
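
The gap between a bounding box and its quadtree square can be seen on a small example. The sketch below uses hypothetical helper names (`quadtree_square_level`, `bounding_box_size`) and assumes points with non-negative integer coordinates; it computes the smallest l such that all coordinates agree on every bit above the lowest l bits, which is the level of the aligned square described above.

```python
def quadtree_square_level(points):
    """Smallest l such that all x-coordinates agree above their lowest l bits,
    and likewise all y-coordinates (the aligned square then has size 2**l)."""
    x0, y0 = points[0]
    diff = 0
    for x, y in points:
        diff |= (x ^ x0) | (y ^ y0)   # collect every bit position that differs
    return diff.bit_length()


def bounding_box_size(points):
    """Side length of the smallest axis-parallel square enclosing the points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return max(max(xs) - min(xs), max(ys) - min(ys))


pts = [(7, 0), (8, 0)]
level = quadtree_square_level(pts)    # 4, i.e. an aligned square of size 16
box = bounding_box_size(pts)          # 1
```

Two points at distance 1 that straddle an aligned boundary thus have a bounding box of size 1 but a quadtree square of size 16, which is why the argument works with the bounding boxes and only compares them to aligned squares of size $2^{l'}$.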



Lemma 4.5 still holds, because every bounding box $B_v$ is contained in a (possibly much larger) 
square $S_v$, so the volume argument applies. Also, Lemma 4.14 only relies on well-separatedness and 
the combinatorial structure of T, so we can find the graph H in linear time. After that, it takes 
O(n) time to compute emst(P), using the transdichotomous minimum spanning tree algorithm by 
Fredman and Willard [29]. □ 

7 Conclusions 

We strengthen the connections between proximity structures in the plane and sharpen several 
known results between them. Even though our results are optimal, the underlying algorithms are 
still quite subtle, and it may be of interest to see whether some of them can be simplified. It is 
also interesting to see whether systematic derandomization techniques, like $\varepsilon$-nets, can be useful to 
yield alternative deterministic algorithms for some of the problems considered here. Finally, some 
of the previous results also apply to higher dimensions, whereas we focus exclusively on the plane. 
Can we obtain analogous derandomizations for $d \geq 3$? 



For specific ranges of w, we can do better. For example, if $w = O(\log n)$, radix sort shows that 
sort(n) = O(n) [22]. 

For two w-bit words, $x = x_1 \ldots x_w$ and $y = y_1 \ldots y_w$, we define shuffle(x, y) as the 2w-bit word 
$x_1 y_1 x_2 y_2 \ldots x_w y_w$. 

A Computational Models 

Since our results concern different computational models, we use this appendix to describe them 
in more detail. Our two models are the real RAM/pointer machine and the word RAM. 

The Real RAM/Pointer Machine. The standard model in computational geometry is the real 
RAM. Here, data is represented as an infinite sequence of storage cells. These cells can be of two 
different types: they can store real numbers or integers. The model supports standard operations 
on these numbers in constant time, including addition, multiplication, and elementary functions 
like square-root. The floor function can be used to truncate a real number to an integer, but if 
we were allowed to use it arbitrarily, the real RAM could solve PSPACE-complete problems in 
polynomial time [46]. Therefore, we usually have only a restricted floor function at our disposal, 
and in this paper it will be banned altogether. 

The pointer machine [36] models the list processing capabilities of a computer and disallows 
the use of constant time table lookup. The data structure is modeled as a directed graph G with 
bounded out-degree. Each node in G represents a record, with a bounded number of pointers to 
other records and a bounded number of (real or integer) data items. The algorithm can access data 
only by following pointers from the inputs (and a bounded number of global entry records); random 
access is not possible. The data can be manipulated through the usual real RAM operations (again, 
we disallow the floor function). 



Word RAM. The word RAM is essentially a real RAM without support for real numbers. 
However, on a real RAM, the integers are usually treated as atomic, whereas the word RAM allows for 
powerful bit-manipulation tricks. More precisely, the word RAM represents the data as a sequence 
of w-bit words, where $w \geq \log n$ (n being the problem size). Data can be accessed arbitrarily, and 
standard operations, such as Boolean operations (and, xor, shl, ...), addition, or multiplication 
take constant time. There are many variants of the word RAM, depending on precisely which 
instructions are supported in constant time. The general consensus seems to be that any function 
in $AC^0$ is acceptable. However, it is always preferable to rely on a set of operations as small, and 
as non-exotic, as possible. Note that multiplication is not in $AC^0$ [30]. Nevertheless, it is usually 
included in the word RAM instruction set [29]. 



$AC^0$ is the class of all functions $f : \{0, 1\}^* \to \{0, 1\}^*$ that can be computed by a family of circuits $(C_n)_{n \in \mathbb{N}}$ with 
the following properties: (i) each $C_n$ has n inputs; (ii) there exist constants a, b, such that $C_n$ has at most $a n^b$ gates, 
for $n \in \mathbb{N}$; (iii) there is a constant d such that for all n the length of the longest path from an input to an output 
in $C_n$ is at most d (i.e., the circuit family has bounded depth); (iv) each gate has an arbitrary number of incoming 
edges (i.e., the fan-in is unbounded). 